  1. Sep 2025
    1. Reviewer #3 (Public review):

      The manuscript investigated the kinetics of molecule transport across interfaces in phase-separated mixtures. Through the development of a theoretical approach for a binary mixture in a sharp interface limit, the authors found that interface resistance leads to a slowdown in interfacial movement. Subsequently, they extended this approach to multiple molecular species (incorporating both labeled and unlabeled molecules) and continuous transport models. Finally, they proposed experimental settings in vitro and commented on the necessary optical resolution to detect signatures of interfacial kinetics associated with resistance.

      The investigation of transport kinetics across biomolecular condensate interfaces holds significant relevance for understanding cellular function and dysfunction mechanisms; thus, the topic is important and timely. However, the current manuscript presentation requires improvement. Firstly, the inclusion of numerous equations in the main text substantially compromises readability, and relocation of a part of the formulae and derivations to the Appendix would be more appropriate. Secondly, the manuscript would benefit from more comprehensive comparisons with existing theoretical studies on molecular transport kinetics. The text should also be written to be more approachable for a general readership. Modifications and sufficient responses to the specific points outlined below are recommended.

      (1) The authors introduced a theoretical framework to study the kinetics of molecules across an interface between two coexisting liquid phases and found that interface resistance leads to a slowdown in interfacial movement in a binary mixture and a decelerated molecule exchange between labeled and unlabeled molecules across the phase boundary. However, these findings appear rather expected. The work would be strengthened by a more thorough discussion of the kinetics of molecule transport across interfaces (such as the physical origin of the interface resistance and its specific impact on transport kinetics).

      (2) The formulae in the manuscript should be checked and corrected. Notably, Equation 10 contains "\phi_2\ln\phi_2" while Eq. 11b shows "n^{-1}\ln\phi_2", suggesting a missing factor of "n^{-1}". Similarly, Equation 18 obtained from Equation 11: the logarithmic term in Eq. 11a is "n^{-1}\ln\phi_1-\ln(1-\phi)" but the pre-exponential factor in Equation 18a is just "\phi_1/(1-\phi*)", where is "n^{-1}"? Additionally, there is a unit inconsistency in Equation 36, where the unit of \rho (s/m) does not match that of the right-hand side expression (s/m^2).

      (3) The authors stated that the numerical solutions are obtained using a custom finite difference scheme implemented in MATLAB in the Appendix. The description of numerical methods is insufficiently detailed and needs to be expanded, including specific equations or models used to obtain specific figures, the introduction of initial and boundary conditions, the choices of parameters and their reasons in terms of the biology.

      (4) The authors claimed that their framework naturally extends to multiple molecular species, but only showed the situation of labeled and unlabeled molecules across a phase boundary. How about three or more molecular species? Does this framework still work? This should be added to strengthen the manuscript and confirm the framework's general applicability.

    2. Author response:

      Reviewer #1 (Public review): 

      Summary: 

      In this manuscript, the authors theoretically address the topic of interface resistance between a phase-separated condensate and the surrounding dilute phase. In a nutshell, "interface resistance" occurs if material in the dilute phase can only slowly pass through the interface region to enter the dense phase. There is some evidence from FRAP experiments that such a resistance may exist, and if it does, it could be biologically relevant insofar as the movement of material between dense and dilute phases can be rate-limiting for biological processes, including coarsening. The current study theoretically addresses interface resistance at two levels of description: first, the authors present a simple way of formulating interface resistance for a sharp interface model. Second, they derive a formula for interface resistance for a finite-width interface and present two scenarios where the interface resistance might be substantial. 

      Strengths: 

      The topic is of broad relevance to the important field of intracellular phase separation, and the work is overall credible. 

      Weaknesses: 

      There are a few problems with the study as presented - mainly that the key formula for the latter section has already been derived and presented in Reference 6 (notably also in this journal), and that the physical basis for the proposed scenarios leading to a large interface resistance is not clearly supported. 

      (1) As noted, Equation 32 of the current study is entirely equivalent to Equation 8 of Reference 6, with a very similar derivation presented in Appendix 1 of that paper. In fact, Equation 8 in Reference 6 takes one more step by combining Equations 32 and 35 to provide a general expression for the interface resistance in an integral form. These prior results should be properly cited in the current work - the existing citations to Reference 6 do not make this overlap apparent. 

      We agree and will make the overlap explicit, acknowledging priority and clarifying what is new here. The initial version of the preprint of Zhang et al. (2022) (https://www.biorxiv.org/content/10.1101/2022.03.16.484641v1) lacked the derivation (it referenced a Supplementary Note not yet available); it was added during the eLife submission. We worked from the preprint and missed this update, which we will now correct.

      (2) The authors of the current study go on to examine cases where this shared equation (here Equation 32) might imply a large interface resistance. The examples are mathematically correct, but physically unsupported. In order to produce a substantial interface resistance, the current authors have to suppose that in the interface region between the dense and dilute phases, either there is a local minimum of the diffusion coefficient or a local minimum of the density. I am not aware of any realistic model that would produce either of these minima. Indeed, the authors do not present sufficient examples or physical arguments that would support the existence of such minima. 

      We respectfully disagree with the reviewer on the physical plausibility of these scenarios: there is concrete evidence, both experimental and theoretical, for the scenarios we discussed.

      Experimental: Strom et al. (2017) (our reference 11) describe a substantially reduced protein diffusion coefficient at an in vivo phase boundary, while Hahn et al. (2011a) and Hahn et al. (2011b) (our references 27 and 28) describe transient accumulation of molecules at a phase boundary, which they attribute to the Donnan potential, although a lowered mobility could conceivably also play a role.

      Theoretical: Recent work (e.g., Majee et al. (2024)) shows that charged layers could form at phase boundaries, which could either repel or attract incoming molecules, depending on their charge, thus altering the local volume fraction and resulting in a trough or peak. Arguably, the model put forth by Zhang et al. (2024) could be mapped to a potential wall, where particles are reflected unless they are in a certain state. We will add sentences to the corresponding results section, as well as to the discussion, to make this plausibility more apparent.

      In my view, these two issues limit the general interest of the latter portion of the current manuscript. While point 1 can be remedied by proper citation, point 2 is not so simple to address. The two ways the authors present to produce a substantial interface resistance seem to me to be mathematical exercises without a physical basis. The manuscript will improve if the authors can provide examples or compelling arguments for a minimum of either diffusion coefficient or density between the dense and dilute phases that would address point 2. 

      We believe we will be able to address both issues.

      Reviewer #2 (Public review): 

      Summary: 

      This work provides a general theoretical framework for understanding molecular transport across liquid-liquid phase boundaries, focusing on interfacial resistance arising from deviations from local equilibrium. By bridging sharp and continuous interface descriptions, the authors demonstrate how distinct microscopic mechanisms can yield similar effective kinetics and propose practical experimental validation strategies. 

      Strengths: 

      (1) Conceptually rich and physically insightful interface resistance formulation in sharp and continuous limits. 

      (2) Strong integration of non-equilibrium thermodynamics with biologically motivated transport scenarios. 

      (3) Thorough numerical and analytical support, with thoughtful connection to current and emerging experimental techniques. 

      (4) Relevance to various systems, including biomolecular condensates and engineered aqueous two-phase systems. 

      Weaknesses: 

      (1) The work remains theoretical, mainly, with limited direct comparison to quantitative experimental data. 

      We agree with the reviewer; an experimental manuscript is in progress.

      (2) The biological implications are only briefly explored; further discussion of specific systems where interface resistance might play a functional role would enhance the impact.

      We thank the reviewer for this comment. We will add several such scenarios to the discussion, including the possibility of using interface resistance to order biochemical reactions in time, as well as its potential to exclude molecules from condensates for long periods. Although such exclusion is not effective in the long-time limit, it could help cells respond to transient events on timescales of minutes to hours.

      (3) Some model assumptions (e.g., symmetric labeling or idealized diffusivity profiles) could be further contextualized regarding biological variability. 

      The treatment of labelled and unlabelled molecules as physically identical is well supported by our experiments. Droplets under typical experimental conditions, i.e. when bleaching is not too strong, do not markedly change size or volume fraction of molecules, which would be expected if the physical properties like molecular volume or interaction strength were significantly changed. However, we do agree that in more extreme bleaching regimes the bleach step itself will change the droplet properties, but this can be avoided by tuning the FRAP laser power and dwell times accordingly.

      Our diffusivity profiles are chosen in the simplest possible way to handle typical experimental constraints (large D outside, lower D inside, potentially lowered D at the boundary) and allow for a mean-field treatment. To the best of our knowledge, the precise make-up and concentration profiles of phase boundaries in biomolecular condensates are not currently known, due to limitations in optical resolution.

      Reviewer #3 (Public review): 

      The manuscript investigated the kinetics of molecule transport across interfaces in phase-separated mixtures. Through the development of a theoretical approach for a binary mixture in a sharp interface limit, the authors found that interface resistance leads to a slowdown in interfacial movement. Subsequently, they extended this approach to multiple molecular species (incorporating both labeled and unlabeled molecules) and continuous transport models. Finally, they proposed experimental settings in vitro and commented on the necessary optical resolution to detect signatures of interfacial kinetics associated with resistance. 

      The investigation of transport kinetics across biomolecular condensate interfaces holds significant relevance for understanding cellular function and dysfunction mechanisms; thus, the topic is important and timely. However, the current manuscript presentation requires improvement. Firstly, the inclusion of numerous equations in the main text substantially compromises readability, and relocation of a part of the formulae and derivations to the Appendix would be more appropriate. Secondly, the manuscript would benefit from more comprehensive comparisons with existing theoretical studies on molecular transport kinetics. The text should also be written to be more approachable for a general readership. Modifications and sufficient responses to the specific points outlined below are recommended. 

      (1) The authors introduced a theoretical framework to study the kinetics of molecules across an interface between two coexisting liquid phases and found that interface resistance leads to a slowdown in interfacial movement in a binary mixture and a decelerated molecule exchange between labeled and unlabeled molecules across the phase boundary. However, these findings appear rather expected. The work would be strengthened by a more thorough discussion of the kinetics of molecule transport across interfaces (such as the physical origin of the interface resistance and its specific impact on transport kinetics). 

      We thank the reviewer for this comment. We will discuss possible mechanisms, and how they map to our mean-field model, in more detail in both the corresponding results section and the discussion, as also outlined in our response to Reviewer #1.

      (2) The formulae in the manuscript should be checked and corrected. Notably, Equation 10 contains "\phi_2\ln\phi_2" while Eq. 11b shows "n^{-1}\ln\phi_2", suggesting a missing factor of "n^{-1}". Similarly, Equation 18 obtained from Equation 11: the logarithmic term in Eq. 11a is "n^{-1}\ln\phi_1-\ln(1-\phi)" but the pre-exponential factor in Equation 18a is just "\phi_1/(1-\phi*)", where is "n^{-1}"? Additionally, there is a unit inconsistency in Equation 36, where the unit of \rho (s/m) does not match that of the right-hand side expression (s/m^2). 

      We thank the reviewer. We identified that the error originates in the inline definition of the exchange chemical potential, just before Equation 11. We inadvertently dropped a prefactor of n, which then shows up in the following equation as an exponent of (1-\phi^*). Importantly, this means that the main result, Eq. 25, still holds; in the revised manuscript we will correct the ensuing typographical mistakes.
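
      As a hedged illustration only, the algebra below shows how a prefactor of n can reappear as an exponent. It starts from the logarithmic term of Eq. 11a exactly as quoted by the reviewer; the shorthand \ell for that term is introduced here and does not appear in the manuscript, and all remaining contributions to the chemical potential are left out.

```latex
\begin{align}
\ell &= n^{-1}\ln\phi_1 - \ln(1-\phi), \\
n\,\ell &= \ln\phi_1 - n\ln(1-\phi), \\
e^{\,n\ell} &= \frac{\phi_1}{(1-\phi)^{n}} .
\end{align}
```

      Under this reading, restoring the prefactor n changes the pre-exponential factor from \phi_1/(1-\phi) to \phi_1/(1-\phi)^n, which would be consistent with the exponent described in the reply.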

      (3) The authors stated that the numerical solutions are obtained using a custom finite difference scheme implemented in MATLAB in the Appendix. The description of numerical methods is insufficiently detailed and needs to be expanded, including specific equations or models used to obtain specific figures, the introduction of initial and boundary conditions, the choices of parameters and their reasons in terms of the biology.

      We will substantially expand the Appendix for the numerical solutions and add an explanatory file to the repository to make clear how the code can be run, as well as its dependencies.

      (4) The authors claimed that their framework naturally extends to multiple molecular species, but only showed the situation of labeled and unlabeled molecules across a phase boundary. How about three or more molecular species? Does this framework still work? This should be added to strengthen the manuscript and confirm the framework's general applicability. 

      We have shown in Bo et al. (2021) that the labelling approach can be carried over to multi-component systems. Each species may, for example, encounter its own interface resistance. We will discuss this in more detail in the revised manuscript.

    1. Author response:

      Notes to Editors

      We previously received comments from three reviewers at Biological Psychiatry, which we have addressed in detail below. The following is a summary of the reviewers’ comments along with our responses.

      Reviewers 1 and 2 sought clearer justification for studying the cognition-mental health overlap (covariation) and its neuroimaging correlates. In the revised manuscript, we expanded the Introduction and Discussion to explicitly outline the theoretical implications of investigating this overlap with machine learning. We also added nuance to the interpretation of the observed associations.

      Reviewer 1 raised concerns about the accessibility of the machine learning methodology for readers without expertise in this field. We revised the Methods section to provide a clearer, step-by-step explanation of our machine learning approach, particularly the two-level machine learning through stacking. We also enhanced the description of the overall machine learning design, including model training, validation, and testing.

      In response to Reviewer 2’s request for deeper interpretation of our findings and stronger theoretical grounding, we have expanded our discussion by incorporating a thorough interpretation of how mental health indices relate to cognition, material that was previously included only in supplementary materials due to word limit constraints. We have further strengthened the theoretical justification for our study design, with particular emphasis on the importance of examining shared variance between cognition and mental health through the derivation of neural markers of cognition. Additionally, to enhance the biological interpretation of our results, we included new analyses of feature importance across neuroimaging modalities, providing clearer insights into which neural features contribute most to the observed relationships.

      Notably, Reviewer 3 acknowledged the strength of our study, including multimodal design, robust analytical approach, and clear visualization and interpretation of results. Their comments were exclusively methodological, underscoring the manuscript’s quality.

      Reviewer 1:

      The authors try to bridge mental health characteristics, global cognition and various MRI-derived (structural, diffusion and resting state fMRI) measures using the large dataset of UK Biobank. Each MRI modality alone explained max 25% of the cognition-mental health covariance, and when combined together 48% of the variance could be explained. As a peer-reviewer not familiar with the used methods (machine learning, although familiar with imaging), the manuscript is hard to read and I wonder what the message for the field might be. In the end of the discussion the authors state '... we provide potential targets for behavioural and physiological interventions that may affect cognition', the real relevance (and impact) of the findings is unclear to me.

      Thank you for your thorough review and practical recommendations. We appreciate your constructive comments and suggestions and hope our revisions adequately address your concerns.

      Major questions

      (1) The methods are hard to follow for people not in this specific subfield, and therefore, I expect that for readers it is hard to understand how valid and how useful the approach is.

      Thank you for your comment. To enhance accessibility for readers without a machine learning background, we revised the Methods section to clarify our analyses while retaining important technical details needed to understand our approach. Recognizing that some concepts may require prior knowledge, we provide detailed explanations of each analysis step, including the machine learning pipeline in the Supplementary Methods.

      Line 188: “We employed nested cross-validation to predict cognition from mental health indices and 72 neuroimaging phenotypes (Fig. 1). Nested cross-validation is a robust method for evaluating machine-learning models while tuning their hyperparameters, ensuring that performance estimates are both accurate and unbiased. Here, we used a nested cross-validation scheme with five outer folds and ten inner folds.

      We started by dividing the entire dataset into five outer folds. Each fold took a turn being held out as the outer-fold test set (20% of the data), while the remaining four folds (80% of the data) were used as an outer-fold training set. Within each outer-fold training set, we performed a second layer of cross-validation – this time splitting the data into ten inner folds. These inner folds were used exclusively for hyperparameter tuning: models were trained on nine of the inner folds and validated on the remaining one, cycling through all ten combinations.

      We then selected the hyperparameter configuration that performed best across the inner-fold validation sets, as determined by the minimal mean squared error (MSE). The model was then retrained on the full outer-fold training set using this hyperparameter configuration and evaluated on the outer-fold test set, using four performance metrics: Pearson r, the coefficient of determination (R²), the mean absolute error (MAE), and the MSE. This entire process was repeated for each of the five outer folds, ensuring that every data point is used for both training and testing, but never at the same time. We opted for five outer folds instead of ten to reduce computational demands, particularly memory and processing time, given the substantial volume of neuroimaging data involved in model training. Five outer folds led to an outer-fold test set of at least n = 4 000, which should be sufficient for model evaluation. In contrast, we retained ten inner folds to ensure robust and stable hyperparameter tuning, maximising the reliability of model selection.

      To model the relationship between mental health and cognition, we employed Partial Least Squares Regression (PLSR) to predict the g-factor from 133 mental health variables. To model the relationship between neuroimaging data and cognition, we used a two-step stacking approach [15–17,61] to integrate information from 72 neuroimaging phenotypes across three MRI modalities. In the first step, we trained 72 base (first-level) PLSR models, each predicting the g-factor from a single neuroimaging phenotype. In the second step, we used the predicted values from these base models as input features for stacked models, which again predicted the g-factor. We constructed four stacked models based on the source of the base predictions: one each for dwMRI, rsMRI, sMRI, and a combined model incorporating all modalities (“dwMRI Stacked”, “rsMRI Stacked”, “sMRI Stacked”, and “All MRI Stacked”, respectively). Each stacked model was trained using one of four machine learning algorithms – ElasticNet, Random Forest, XGBoost, or Support Vector Regression – selected individually for each model (see Supplementary Materials, S6).

      For rsMRI phenotypes, we treated the choice of functional connectivity quantification method – full correlation, partial correlation, or tangent space parametrization – as a hyperparameter. The method yielding the highest performance on the outer-fold training set was selected for predicting the g-factor (see Supplementary Materials, S5).

      To prevent data leakage, we standardized the data using the mean and standard deviation derived from the training set and applied these parameters to the corresponding test set within each outer fold. This standardization was performed at three key stages: before g-factor derivation, before regressing out modality-specific confounds from the MRI data, and before stacking. Similarly, to maintain strict separation between training and testing data, both base and stacked models were trained exclusively on participants from the outer-fold training set and subsequently applied to the corresponding outer-fold test set.

      To evaluate model performance and assess statistical significance, we aggregated the predicted and observed g-factor values from each outer-fold test set. We then computed a bootstrap distribution of Pearson’s correlation coefficient (r) by resampling with replacement 5 000 times, generating 95% confidence intervals (CIs) (Fig. 1). Model performance was considered statistically significant if the 95% CI did not include zero, indicating that the observed associations were unlikely to have occurred by chance.”
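
      The quoted procedure can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: a one-feature ridge regression stands in for PLSR and the stacked models, the data are synthetic, and all function names (k_folds, fit_ridge, nested_cv, bootstrap_ci) are hypothetical. It shows the five-by-ten nested split, hyperparameter selection by inner-fold MSE, standardization with training-set statistics only, and a percentile bootstrap CI for Pearson's r on the pooled outer-fold predictions.

```python
import random
import statistics


def k_folds(indices, k, rng):
    """Shuffle the indices and deal them into k folds of near-equal size."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def fit_ridge(x, y, penalty):
    """One-feature ridge fit; x is standardized with training statistics only."""
    mx, sx, my = statistics.fmean(x), statistics.pstdev(x), statistics.fmean(y)
    z = [(v - mx) / sx for v in x]
    slope = sum(a * (b - my) for a, b in zip(z, y)) / (sum(a * a for a in z) + penalty)
    # The returned predictor reuses the training mean/std, so no test data leaks in.
    return lambda v: my + slope * (v - mx) / sx


def mse(model, x, y):
    return statistics.fmean([(model(a) - b) ** 2 for a, b in zip(x, y)])


def pearson_r(a, b):
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def nested_cv(x, y, penalties, n_outer=5, n_inner=10, seed=0):
    """Return pooled (observed, predicted) values from the outer-fold test sets."""
    rng = random.Random(seed)
    observed, predicted = [], []
    for test_idx in k_folds(list(range(len(x))), n_outer, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        inner = k_folds(train_idx, n_inner, rng)

        def inner_mse(penalty):
            # Mean validation MSE over the inner folds for one hyperparameter value.
            scores = []
            for val_idx in inner:
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                model = fit_ridge([x[i] for i in fit_idx], [y[i] for i in fit_idx], penalty)
                scores.append(mse(model, [x[i] for i in val_idx], [y[i] for i in val_idx]))
            return statistics.fmean(scores)

        best = min(penalties, key=inner_mse)
        # Retrain on the full outer-fold training set, then predict the held-out fold.
        model = fit_ridge([x[i] for i in train_idx], [y[i] for i in train_idx], best)
        observed += [y[i] for i in test_idx]
        predicted += [model(x[i]) for i in test_idx]
    return observed, predicted


def bootstrap_ci(observed, predicted, n_boot=5000, seed=0):
    """95% percentile bootstrap confidence interval for Pearson's r."""
    rng = random.Random(seed)
    n = len(observed)
    rs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rs.append(pearson_r([observed[i] for i in idx], [predicted[i] for i in idx]))
    rs.sort()
    return rs[int(0.025 * n_boot)], rs[int(0.975 * n_boot) - 1]


# Synthetic example: one predictor with a true linear effect plus noise.
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [v + data_rng.gauss(0, 1) for v in x]
observed, predicted = nested_cv(x, y, penalties=[0.0, 1.0, 10.0, 100.0])
low, high = bootstrap_ci(observed, predicted, n_boot=1000)
print(f"95% CI for r: [{low:.2f}, {high:.2f}]")
```

      In this sketch, performance would be called significant when the printed interval excludes zero, mirroring the criterion in the quoted Methods text. Because every participant appears in exactly one outer-fold test set, the pooled vectors have one prediction per participant.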

      (2) If only 40% of the cognition-mental health covariation can be explained by the MRI variables, how to explain the other 60% of the variance? And related to this %: why do the author think that 'this provides us confidence in using MRI to derive quantitative neuromarkers of cognition'?

      Thank you for this insightful observation. Using the MRI modalities available in the UK Biobank, we were able to account for 48% of the covariation between cognition and mental health. The remaining 52% of unexplained variance may arise from several sources. One possibility is the absence of certain neuroimaging modalities in the UK Biobank dataset, such as task-based fMRI contrasts, positron emission tomography, arterial spin labeling, and magnetoencephalography/electroencephalography. Prior research from our group and others has consistently demonstrated strong predictive performance from specific task-based fMRI contrasts, particularly those derived from tasks like the n-Back working memory task and the face-name episodic memory task, none of which is available in the UK Biobank.

      Moreover, there are inherent limitations in using MRI as a proxy for brain structure and function. Measurement error and intra-individual variability, such as differences in a cognitive state between cognitive assessments and MRI acquisition, may also contribute to the unexplained variance. According to the Research Domain Criteria (RDoC) framework, brain circuits represent only one level of neurobiological analysis relevant to cognition. Other levels, including genes, molecules, cells, and physiological processes, may also play a role in the cognition-mental health relationship.

      Nonetheless, neuroimaging provides a valuable window into the biological mechanisms underlying this overlap – insights that cannot be gleaned from behavioural data alone. We have now incorporated these considerations into the Discussion section.

      Line 658: “Although recent debates [18] have challenged the predictive utility of MRI for cognition, our multimodal marker integrating 72 neuroimaging phenotypes captures nearly half of the mental health-explained variance in cognition. We demonstrate that neural markers with greater predictive accuracy for cognition also better explain cognition-mental health covariation, showing that multimodal MRI can capture both a substantial cognitive variance and nearly half of its shared variance with mental health. Finally, we show that our neuromarkers explain a substantial portion of the age- and sex-related variance in the cognition-mental health relationship, highlighting their relevance in modeling cognition across demographic strata.

      The remaining unexplained variance in the relationship between cognition and mental health likely stems from multiple sources. One possibility is the absence of certain neuroimaging modalities in the UK Biobank dataset, such as task-based fMRI contrasts, positron emission tomography, arterial spin labeling, and magnetoencephalography/electroencephalography. Prior research has consistently demonstrated strong predictive performance from specific task-based fMRI contrasts, particularly those derived from tasks like the n-Back working memory task and the face-name episodic memory task, none of which is available in the UK Biobank [15,17,61,69,114,142,151].

      Moreover, there are inherent limitations in using MRI as a proxy for brain structure and function. Measurement error and intra-individual variability, such as differences in a cognitive state between cognitive assessments and MRI acquisition, may also contribute to the unexplained variance. According to the RDoC framework, brain circuits represent only one level of neurobiological analysis relevant to cognition [14]. Other levels, including genes, molecules, cells, and physiological processes, may also play a role in the cognition-mental health relationship.

      Nonetheless, neuroimaging provides a valuable window into the biological mechanisms underlying this overlap – insights that cannot be gleaned from behavioural data alone. Ultimately, our findings validate brain-based neural markers as a fundamental neurobiological unit of analysis, advancing our understanding of mental health through the lens of cognition.”

      Regarding our confidence in using MRI to derive neural markers for cognition, we base this on the predictive performance of MRI-based models. As we note in the Discussion (Line 554: “Consistent with previous studies, we show that MRI data predict individual differences in cognition with a medium-size performance (r ≈ 0.4) [15–17, 28, 61, 67, 68].”), the medium effect size we observed (r ≈ 0.4) agrees with existing literature on brain-cognition relationships, confirming that machine learning leads to replicable results. This effect size represents a moderate yet meaningful association in neuroimaging studies of aging, consistent with reports linking brain to behaviour in adults (Krämer et al., 2024; Tetereva et al., 2022). For example, a recent meta-analysis by Vieira and colleagues (2022) reported a similar effect size (r = 0.42, 95% CI [0.35;0.50]). Our study includes over 15000 participants, comparable to or more than typical meta-analyses, allowing us to characterise our work as a “mega-analysis”. And on top of this predictive performance, we found our neural markers for cognition to capture half of the cognition-mental health covariation, boosting our confidence in our approach.

      Krämer C, Stumme J, da Costa Campos L, Dellani P, Rubbert C, Caspers J, et al. Prediction of cognitive performance differences in older age from multimodal neuroimaging data. GeroScience. 2024;46:283–308.

      Tetereva A, Li J, Deng JD, Stringaris A, Pat N. Capturing brain cognition relationship: Integrating task‐based fMRI across tasks markedly boosts prediction and test‐retest reliability. NeuroImage. 2022;263:119588.

      (3) Imagine that we can increase the explained variance using multimodal MRI measures, why is it useful? What does it learn us? What might be the implications?

      We assume that by variance, Reviewer 1 referred to the cognition-mental health covariation mentioned in point 2) above.

      If we can increase the explained cognition-mental health covariation using multimodal MRI measures, it would mean that we have developed a reasonable neuromarker that is close to RDoC’s neurobiological unit of analysis for cognition. RDoC treats cognition as one of the main basic functional domains that transdiagnostically underlie mental health. According to RDoC, mental health should be studied in relation to cognition, alongside other domains such as negative and positive valence systems, arousal and regulatory systems, social processes, and sensorimotor functions. RDoC further emphasizes that each domain, including cognition, should be investigated not only at the behavioural level but also through its neurobiological correlates. This means RDoC aims to discover neural markers of cognition that explain the covariation between cognition and mental health. We approach the development of such neural markers using multimodal neuroimaging. We have now explained the motivation of our study in the first paragraph of the Introduction.

      Line 43: “Cognition and mental health are closely intertwined [1]. Cognitive dysfunction is present in various mental illnesses, including anxiety [2, 3], depression [4–6], and psychotic disorders [7–12]. The National Institute of Mental Health’s Research Domain Criteria (RDoC) [13,14] treat cognition as one of the main basic functional domains that transdiagnostically underlie mental health. According to RDoC, mental health should be studied in relation to cognition, alongside other domains such as negative and positive valence systems, arousal and regulatory systems, social processes, and sensorimotor functions. RDoC further emphasizes that each domain, including cognition, should be investigated not only at the behavioural level but also through its neurobiological correlates. In this study, we aim to examine how the covariation between cognition and mental health is reflected in neural markers of cognition, as measured through multimodal neuroimaging.”

      More specific issues:

      Introduction

      (4) In the intro the sentence 'in some cases, altered cognitive functioning is directly related to psychiatric symptom severity' is in contrast to the next sentence '... are often stable and persist upon alleviation of psychiatric symptoms'.

      Thank you for pointing this out. The first sentence refers to cases where cognitive deficits fluctuate with symptom severity, while the second emphasizes that core cognitive impairments often remain stable even during symptom remission. To avoid this confusion, we have removed these sentences.

      (5) In the intro the text on the methods (various MRI modalities) is not needed for the Biol Psych readers audience.

      We appreciate your comment. While some members of our target audience may have backgrounds in neuroimaging, machine learning, or psychiatry, we recognize that not all readers will be familiar with all three areas. To ensure accessibility for those who are not familiar with neuroimaging, we included a brief overview of the MRI modalities and quantification methods used in our study to provide context for the specific neuroimaging phenotypes. Additionally, we provided background information on the machine learning techniques employed, so that readers without a strong background in machine learning can still follow our methodology.

      (6) Regarding the age of the study sample: I understand that at recruitment the subjects' age ranged from 40 to 69 years. At MRI scanning the age ranges from about 46 to 82. How is that possible? And related to the age of the population: how did the authors deal with age in the analyses, since age affects both cognition and the brain measures?

      Thank you for noticing this. In the Methods section, we first outline the characteristics of the UK Biobank cohort, including the age at first recruitment (40-69 years). Table 1 then shows the characteristics of participant subsamples included in each analysis. Since our study used data from Instance 2 (the second in-person visit), participants were approximately 5-13 years older at scanning, resulting in the age range of 46 to 82 years. We clarified the Table 1 caption as follows:

      Line 113: “Table 1. Demographics for each subsample analysed: number, age, and sex of participants who completed all cognitive tests, mental health questionnaires, and MRI scanning”

      We acknowledge that age may influence cognitive and neuroimaging measures. In our analyses, we intentionally preserved age-related variance in brain-cognition relationships across mid and late adulthood, as regressing out age completely would artificially remove biologically meaningful associations. At the same time, we rigorously addressed the effects of age and sex through additional commonality analyses quantifying age and sex contributions to the relationship between cognition and mental health.

      As noted by Reviewer 1 and illustrated in Figure 8, age and sex shared substantial overlapping variance with both mental health and neuroimaging phenotypes in explaining cognitive outcomes. For example, in Figure 8i, age and sex together accounted for 43% of the variance in the cognition-mental health relationship:

      (2.76 + 1.03) / (2.76 + 1.03 + 3.52 + 1.45) ≈ 0.43

      Furthermore, neuromarkers from the all-MRI stacked model explained 72% of this age/sex-related variance:

      2.76 / (2.76 + 1.03) ≈ 0.72

      This indicates that our neuromarkers captured a substantial portion of the cognition-mental health covariation that varied with age and sex, highlighting their relevance in age/sex-sensitive cognitive modeling.
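
      The two ratios above can be reproduced with a few lines of arithmetic. The sketch below uses only the rounded components quoted from Figure 8i; the variable names are illustrative labels chosen here, not terminology from the manuscript. With these rounded inputs the second ratio evaluates to roughly 0.73, so the reported 72% presumably rests on unrounded values.

```python
# Variance components (percent of variance in cognition) as quoted from Figure 8i.
# Variable names are illustrative labels, not the authors' terminology.
shared_with_age_sex_and_mri = 2.76   # numerator of the second ratio
shared_with_age_sex_other = 1.03     # remaining age/sex-related component
not_shared_with_age_sex = (3.52, 1.45)

# Numerator of the first ratio: everything that overlaps with age and sex.
age_sex_related = shared_with_age_sex_and_mri + shared_with_age_sex_other

# Denominator of the first ratio: all four components quoted in the text.
total = age_sex_related + sum(not_shared_with_age_sex)

age_sex_share = age_sex_related / total
mri_share_of_age_sex = shared_with_age_sex_and_mri / age_sex_related

print(f"age/sex share of the relationship: {age_sex_share:.3f}")
print(f"share of that captured by the neuromarker: {mri_share_of_age_sex:.3f}")
```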

      In the Methods, Results, and Discussion, we say:

      Methods

      Line 263: “To understand how demographic factors, including age and sex, contribute to this relationship, we also conducted a separate set of commonality analyses treating age, sex, age², age×sex, and age²×sex as an additional set of explanatory variables (Fig. 1).”

      Results

      Line 445: “Age and sex shared substantial overlapping variance with both mental health and neuroimaging in explaining cognition, accounting for 43% of the variance in the cognition-mental health relationship. Multimodal neural marker of cognition based on three MRI modalities (“All MRI Stacked”) explained 72% of this age and sex-related variance (Fig. 8i–l and Table S21).”

      Discussion

      Line 660: “We demonstrate that neural markers with greater predictive accuracy for cognition also better explain cognition-mental health covariation, showing that multimodal MRI can capture both a substantial cognitive variance and nearly half of its shared variance with mental health. Finally, we show that our neuromarkers explain a substantial portion of the age- and sex-related variance in the cognition-mental health relationship, highlighting their relevance in modeling cognition across demographic strata.”

      (7) Regarding the mental health variables: were characteristics with a positive value (e.g. happiness and subjective wellbeing) reverse-scored (compared to the negative items, such as anxiety, addiction, etc.)?

      We appreciate you noting this. These composite scores primarily represent standard clinical measures such as the GAD-7 anxiety scale and N-12 neuroticism scale. We did not reverse the scores, so that they keep their original directionality and can be interpreted consistently with the studies from which they were derived (e.g., Davis et al., 2020; Dutt et al., 2022). Complete descriptive statistics for all mental health indices and detailed derivation procedures are provided in the Supplementary Materials (S2). On Page 6, Supplementary Methods, we say:

      Line 92: “Composite mental health scores included the Generalized Anxiety Disorder (GAD-7), the Posttraumatic Stress Disorder (PTSD) Checklist (PCL-6), the Alcohol Use Disorders Identification Test (AUDIT), the Patient Health Questionnaire (PHQ-9) [12], the Eysenck Neuroticism (N-12), Probable Depression Status (PDS), and the Recent Depressive Symptoms (RDS-4) scores [13, 14]. To calculate the GAD-7, PCL-6, AUDIT, and PHQ-9, we used questions introduced at the online follow-up [12]. To obtain the N-12, PDS, and RDS-4 scores [14], we used data collected during the baseline assessment [13, 14].

      We subcategorized depression and GAD based on frequency, current status (ever had depression or anxiety and current status of depression or anxiety), severity, and clinical diagnosis (depression or anxiety confirmed by a healthcare practitioner). Additionally, we differentiated between different depression statuses, such as recurrent depression, depression triggered by loss, etc. Variables related to self-harm were subdivided based on whether a person has ever self-harmed with the intent to die.

      To make response scales more intuitive, we recoded responses within the well-being domain such that the lower score corresponded to a lesser extent of satisfaction (“Extremely unhappy”) and the higher score indicated a higher level of happiness (“Extremely happy”). For all questions, we assigned the median values to “Prefer not to answer” (-818 for in-person assessment and -3 for online questionnaire) and “Do not know” (-121 for in-person assessment and -1 for online questionnaire) responses. We excluded the “Work/job satisfaction” question from the mental health derivatives list because it included a “Not employed” response option, which could not be reasonably coded.

      To calculate the risk of PTSD, we used questions from the PCL-6 questionnaire. Following Davis and colleagues [12], PCL-6 scores ranged from 6 to 29. A PCL-6 score of 12 or below corresponds to a low risk of meeting the Clinician-Administered PTSD Scale diagnostic criteria. PCL-6 scores between 13 and 16 and between 17 and 25 are indicative of an increased risk and high risk of PTSD, respectively. A score of above 26 is interpreted as a very high risk of PTSD [12, 15]. PTSD status was set to positive if the PCL-6 score exceeded or was equal to 14 and encompassed stressful events instead of catastrophic trauma alone [12].

      To assess alcohol consumption, alcohol dependence, and harm associated with drinking, we calculated the sum of the ten questions from the AUDIT questionnaire [16]. We additionally subdivided the AUDIT score into the alcohol consumption score (questions 1-3, AUDIT-C) and the score reflecting problems caused by alcohol (questions 4-10, AUDIT-P) [17]. In questions 2-10 that followed the first trigger question (“Frequency of drinking alcohol”), we replaced missing values with 0 as they would correspond to a “Never” response to the first question.

      An AUDIT score cut-off of 8 suggests moderate or low-risk alcohol consumption, and scores of 8 to 15 and above 15 indicate severe/harmful and hazardous (alcohol dependence or moderate-severe alcohol use disorder) drinking, respectively [16, 18]. Subsequently, hazardous alcohol use and alcohol dependence status correspond to AUDIT scores of ≥ 8 and ≥ 15, respectively. The “Alcohol dependence ever” status was set to positive if a participant had ever been physically dependent on alcohol. To reduce skewness, we log(x+1)-transformed the AUDIT, AUDIT-C, and AUDIT-P scores [17].”
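
      The AUDIT scoring steps quoted above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `audit_scores`, the use of `None` to mark a missing answer, the refusal to score a participant whose first item is missing, and the choice of the natural logarithm for the log(x+1) transform are all assumptions made here.

```python
import math
from typing import Optional, Sequence


def audit_scores(responses: Sequence[Optional[int]]) -> dict:
    """Score one participant's ten AUDIT items (None marks a missing answer)."""
    if len(responses) != 10:
        raise ValueError("AUDIT has ten items")
    if responses[0] is None:
        # Assumption: without the trigger question the score is left undefined.
        raise ValueError("item 1 (frequency of drinking) is required")

    # Items 2-10: missing values become 0, as described in the quoted text.
    items = [responses[0]] + [0 if r is None else r for r in responses[1:]]

    total = sum(items)
    audit_c = sum(items[:3])  # questions 1-3: consumption
    audit_p = sum(items[3:])  # questions 4-10: alcohol-related problems

    return {
        "audit": total,
        "audit_c": audit_c,
        "audit_p": audit_p,
        # log(x + 1); the natural logarithm is an assumption.
        "log_audit": math.log1p(total),
        "log_audit_c": math.log1p(audit_c),
        "log_audit_p": math.log1p(audit_p),
        # Status cut-offs as stated in the quoted text.
        "hazardous_use": total >= 8,
        "alcohol_dependence": total >= 15,
    }


example = audit_scores([1, 2, 3, None, 0, 1, None, 0, 2, 4])
print(example["audit"], example["audit_c"], example["audit_p"])
```

      Keeping the raw and transformed scores side by side makes it easy to check that the cut-offs are applied to the untransformed total.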

      Davis KAS, Coleman JRI, Adams M, Allen N, Breen G, Cullen B, et al. Mental health in UK Biobank – development, implementation and results from an online questionnaire completed by 157 366 participants: a reanalysis. BJPsych Open. 2020;6:e18.

      Dutt RK, Hannon K, Easley TO, Griffis JC, Zhang W, Bijsterbosch JD. Mental health in the UK Biobank: A roadmap to self-report measures and neuroimaging correlates. Hum Brain Mapp. 2022;43:816–832.

      (8) In the discussion section (page 23, line 416-421), the authors refer to specific findings that are not described in the results section > I would add these findings to the main manuscript (including the discussion / interpretation).

      We appreciate your careful reading. We agree that our original Results section did not explicitly describe the factor loadings for mental health in the PLSR model, despite discussing their implications later in the paper. We needed to include this part of the discussion in the Supplementary Materials to meet the word limit of the original submission. However, in response to your suggestion, we have now added the results regarding factor loadings to the Results section. We also moved the discussion of the association between mental health features and general cognition from the Supplementary Material to the manuscript’s Discussion.

      Results

      Line 298: “On average, information about mental health predicted the g-factor at mean R² = 0.10 and mean r = 0.31 (95% CI [0.291, 0.315]; Fig. 2b and 2c and Supplementary Materials, S9, Table S12). The magnitude and direction of factor loadings for mental health in the PLSR model allowed us to quantify the contribution of individual mental health indices to cognition. Overall, the scores for mental distress, alcohol and cannabis use, and self-harm behaviours relate positively, and the scores for anxiety, neurological and mental health diagnoses, unusual or psychotic experiences, happiness and subjective well-being, and negative traumatic events relate negatively to cognition.”

      Discussion

      Line 492: “Factor loadings derived from the PLSR model showed that the scores for mental distress, alcohol and cannabis use, and self-harm behaviours related positively, and the scores for anxiety, neurological and mental health diagnoses, unusual or psychotic experiences, happiness and subjective well-being, and negative traumatic events related negatively to the g-factor. Positive PLSR loadings of features related to mental distress may indicate greater susceptibility to or exaggerated perception of stressful events, psychological overexcitability, and predisposition to rumination in people with higher cognition [72]. On the other hand, these findings may be specific to the UK Biobank cohort and the way the questions for this mental health category were constructed. In particular, to evaluate mental distress, the UK Biobank questionnaire asked whether an individual sought or received medical help for or suffered from mental distress. In this regard, the estimate for mental distress may be more indicative of whether an individual experiencing mental distress had an opportunity or aspiration to visit a doctor and seek professional help [73]. Thus, people with better cognitive abilities and also with a higher socioeconomic status may indeed be more likely to seek professional help.

      Limited evidence supports a positive association between self-harm behaviours and cognitive abilities, with some studies indicating higher cognitive performance as a risk factor for non-suicidal self-harm. Research shows an inverse relationship between cognitive control of emotion and suicidal behaviours that weakens over the life course [73,74]. Some studies have found a positive correlation between cognitive abilities and the risk of nonsuicidal self-harm, suicidal thoughts, and suicidal plans that may be independent of or, conversely, affected by socioeconomic status [75,76]. In our study, the magnitude of the association between self-harm behaviours and cognition was low (Fig. 2), indicating a weak relationship.

      Positive PLSR loadings of features related to alcohol and cannabis may also indicate the influence of other factors. Overall, this relationship is believed to be largely affected by age, income, education, social status, social equality, social norms, and quality of life [79–80]. For example, education level and income correlate with cognitive ability and alcohol consumption [79,81–83]. Research also links a higher probability of having tried alcohol or recreational drugs, including cannabis, to a tendency of more intelligent individuals to approach evolutionarily novel stimuli [84,85]. This hypothesis is supported by studies showing that cannabis users perform better on some cognitive tasks [86]. Alternatively, frequent drinking can indicate higher social engagement, which is positively associated with cognition [87]. Young adults often drink alcohol as a social ritual in university settings to build connections with peers [88]. In older adults, drinking may accompany friends or family visits [89,90]. Mixed evidence on the link between alcohol and drug use and cognition makes it difficult to draw definite conclusions, leaving an open question about the nature of this relationship.

      Consistent with previous studies, we showed that anxiety and negative traumatic experiences were inversely associated with cognitive abilities [90–93]. Anxiety may be linked to poorer cognitive performance via reduced working memory capacity, increased focus on negative thoughts, and attentional bias to threatening stimuli that hinder the allocation of cognitive resources to a current task [94–96]. Individuals with PTSD consistently showed impaired verbal and working memory, visual attention, inhibitory function, task switching, cognitive flexibility, and cognitive control [97–100]. Exposure to traumatic events that did not reach the PTSD threshold was also linked to impaired cognition. For example, childhood trauma is associated with worse performance in processing speed, attention, and executive function tasks in adulthood, and age at a first traumatic event is predictive of the rate of executive function decline in midlife [101,102]. In the UK Biobank cohort, adverse life events have been linked to lower cognitive flexibility, partially via depression level [103].

      In agreement with our findings, cognitive deficits are often found in psychotic disorders [104,105]. We treated neurological and mental health symptoms as predictor variables and did not stratify or exclude people based on psychiatric status or symptom severity. Since no prior studies have examined isolated psychotic symptoms (e.g., recent unusual experiences, hearing unreal voices, or seeing unreal visions), we avoid speculating on how these symptoms relate to cognition in our sample.

      Finally, negative PLSR loadings of the features related to happiness and subjective well-being may be specific to the study cohort, as these findings do not agree with some previous research [107–109]. On the other hand, our results agree with the study linking excessive optimism or optimistic thinking to lower cognitive performance in memory, verbal fluency, fluid intelligence, and numerical reasoning tasks, and suggesting that pessimism or realism indicates better cognition [110]. The concept of realism/optimism as indicators of cognition is a plausible explanation for a negative association between the g-factor and friendship satisfaction, as well as a negative PLSR loading of feelings that life is meaningful, especially in older adults who tend to reflect more on the meaning of life [111]. The latter is supported by the study showing a negative association between cognitive function and the search for the meaning of life and a change in the pattern of this relationship after the age of 60 [112]. Finally, a UK Biobank study found a positive association of happiness with speed and visuospatial memory but a negative relationship with reasoning ability [113].”

      (9) In the discussion section (page 24, line 440-449), the authors give an explanation of why the diffusion measures have limited utility, but the arguments put forward also concern structural and rsfMRI measures.

      Thank you for this important observation. Indeed, the argument about voxel-averaged diffusion components (“… these metrics are less specific to the properties of individual white matter axons or bundles, and instead represent a composite of multiple diffusion components averaged within a voxel and across major fibre pathways”) could theoretically apply across other MRI modalities. We have therefore removed this point from the discussion to avoid overgeneralization. However, we maintain our central argument about the biological specificity of conventional tractography-derived diffusion metrics as their particular sensitivity to white matter microstructure (e.g., axonal integrity, myelin content) may make them better suited for detecting neuropathological changes than dynamic cognitive processes. This interpretation aligns with the mixed evidence linking these metrics to cognitive performance, despite their established utility in detecting white matter abnormalities in clinical populations (e.g., Bergamino et al., 2021; Silk et al., 2009). We clarify this distinction in the manuscript.

      Line 572: “The somewhat limited utility of diffusion metrics derived specifically from probabilistic tractography in serving as robust quantitative neuromarkers of cognition and its shared variance with mental health may stem from their greater sensitivity and specificity to neuronal integrity and white matter microstructure rather than to dynamic cognitive processes. Critically, probabilistic tractography may be less effective at capturing relationships between white matter microstructure and behavioural scores cross-sectionally, as this method is more sensitive to pathological changes or dynamic microstructural alterations like those occurring during maturation. While these indices can capture abnormal white matter microstructure in clinical populations such as Alzheimer’s disease, schizophrenia, or attention deficit hyperactivity disorder (ADHD) [117–119], the empirical evidence on their associations with cognitive performance is controversial [114, 120–126].”

      Bergamino M, Walsh RR, Stokes AM. Free-water diffusion tensor imaging improves the accuracy and sensitivity of white matter analysis in Alzheimer’s disease. Sci Rep. 2021;11:6990.

      Silk TJ, Vance A, Rinehart N, Bradshaw JL, Cunnington R. White-matter abnormalities in attention deficit hyperactivity disorder: a diffusion tensor imaging study. Hum Brain Mapp. 2009;30:2757–2765.

      Reviewer 2:

      This is an interesting study combining a lot of data to investigate the link between cognition and mental health. The description of the study is very clear, it's easy to read for someone like me who does not have a lot of expertise in machine learning.

      We thank you for your thorough review and constructive feedback. Your insightful comments have helped us identify conceptual and methodological aspects that required improvement in the manuscript. We have incorporated relevant changes throughout the paper, and below, we address each of your points in detail.

      Comment 1: My main concern with this manuscript is that it is not yet clear to me what it exactly means to look at the overlap between cognition and mental health. This relation is r=0.3 which is not that high, so why is it then necessary to explain this overlap with neuroimaging measures? And, could it be that the relation between cognition and mental health is explained by third variables (environment? opportunities?). In the introduction I miss an explanation of why it is important to study this and what it will tell us, and in the discussion I would like to read some kind of 'answer' to these questions.

      Thank you. It’s important to clarify why we investigated the relationship between cognition and mental health, and what we found using data from the UK Biobank.

      Conceptually, our work is grounded in the Research Domain Criteria (RDoC; Insel et al., 2010) framework. RDoC conceptualizes mental health not through traditional diagnostic categories, but through core functional domains that span the full spectrum from normal to abnormal functioning. These domains include cognition, negative and positive valence systems, arousal and regulatory systems, social processes, and sensorimotor functions. Within this framework, cognition is considered a fundamental domain that contributes to mental health across diagnostic boundaries. Meta-analytic evidence supports a link between cognitive functioning and mental health (Abramovitch et al., 2021; East-Richard et al., 2020). In the context of a large, population-based dataset like the UK Biobank, this implies that cognitive performance – as measured by various cognitive tasks – should be meaningfully associated with available mental health indicators.

      However, because cognition is only one of several functional domains implicated in mental health, we do not expect the covariation between cognition and mental health to be very high. Other domains, such as negative and positive valence systems, arousal and regulatory systems, or social processing, may also play significant roles. Theoretically, this places an upper bound on the strength of the cognition-mental health relationship, especially in normative, nonclinical samples.

      Our current findings from the UK Biobank reflect this. Most of the 133 mental health variables showed relatively weak individual correlations with cognition (mean r = 0.01, SD = 0.05, min r = –0.08, max r = 0.17; see Figure 2). However, using a PLS-based machine learning approach, we were able to integrate information across all mental-health variables to predict cognition, yielding an out-of-sample correlation of r = 0.31 [95% CI: 0.29, 0.32].

      We believe this estimate approximates the true strength of the cognition-mental health relationship in normative samples, consistent with both theoretical expectations and prior empirical findings. Theoretically, this aligns with the RDoC view that cognition is one of several contributing domains. Empirically, our results are consistent with findings from our previous mega-analysis in children (Wang et al., 2025). Moreover, in the field of gerontology, an effect size of r = 0.31 is not considered small. According to Brydges (2019), it falls around the 70th percentile of effect sizes reported in gerontological studies and approaches the threshold for a large effect (r = 0.32). Given that most studies report within-sample associations, our out-of-sample results are likely more robust and generalizable (Yarkoni & Westfall, 2017).

      To answer, “why is it then necessary to explain this overlap with neuroimaging measures”, we again draw on the conceptual foundation of the RDoC framework. RDoC emphasizes that each functional domain, such as cognition, should be studied not only at the behavioural level but also across multiple neurobiological units of analysis, including genes, molecules, cells, circuits, physiology, and behaviour.

      MRI-based neural markers represent one such level of analysis. While other biological systems (e.g., genetic, molecular, or physiological) also contribute to the cognition-mental health relationship, neuroimaging provides unique insights into the brain mechanisms underlying this association – insights that cannot be obtained from behavioural data alone.

      In response to the related question, “Could the relationship between cognition and mental health be explained by third variables (e.g., environment, opportunities)?”, we note that developing a neural marker of cognition capable of capturing its relationship with mental health is the central aim of this study. Using the MRI modalities available in the UK Biobank, we were able to account for 48% of the covariation between cognition and mental health.
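
      For readers less familiar with commonality analysis, the sketch below shows the textbook two-set decomposition that underlies a statement such as "the neuromarker accounts for a given share of the cognition-mental health covariation". It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the R² inputs are made-up numbers, and the share is assumed to be the common component divided by all variance in cognition explained by mental health.

```python
from typing import NamedTuple


class Commonality(NamedTuple):
    unique_a: float  # variance explained by set A only
    unique_b: float  # variance explained by set B only
    common: float    # variance explained jointly by A and B


def commonality_two_sets(r2_a: float, r2_b: float, r2_ab: float) -> Commonality:
    """Decompose R² of a model with predictor sets A and B.

    r2_a, r2_b: R² of the models using only A or only B.
    r2_ab: R² of the model using both sets together.
    """
    return Commonality(
        unique_a=r2_ab - r2_b,
        unique_b=r2_ab - r2_a,
        common=r2_a + r2_b - r2_ab,
    )


def shared_proportion_of_a(parts: Commonality) -> float:
    """Share of A's explained variance that is also captured by B."""
    return parts.common / (parts.common + parts.unique_a)


# Hypothetical inputs: A = mental health, B = neuromarker of cognition.
parts = commonality_two_sets(r2_a=0.10, r2_b=0.20, r2_ab=0.25)
print(parts)
print(f"share of A's variance captured by B: {shared_proportion_of_a(parts):.2f}")
```

      The three components always sum to the R² of the joint model, which is a convenient sanity check when extending the decomposition to more predictor sets.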

      The remaining 52% of unexplained variance may stem from several sources. According to the RDoC framework, neuromarkers could be further refined by incorporating additional neuroimaging modalities (e.g., task-based fMRI, PET, ASL, MEG/EEG, fNIRS) and integrating other units of analysis such as genetic, molecular, cellular, and physiological data.

      Once more comprehensive neuromarkers are developed, capturing a greater proportion of the cognition-mental health covariation, they may also open a new research direction: investigating how environmental factors and life opportunities influence these markers. However, exploring those environmental contributions lies beyond the scope of the current study.

      We discuss these considerations and explain the motivation of our study in the revised Introduction and Discussion.

      Introduction

      Line 43: “Cognition and mental health are closely intertwined [1]. Cognitive dysfunction is present in various mental illnesses, including anxiety [2, 3], depression [4–6], and psychotic disorders [7–12]. National Institute of Mental Health’s Research Domain Criteria (RDoC) [13,14] treats cognition as one of the main basic functional domains that transdiagnostically underlie mental health. According to RDoC, mental health should be studied in relation to cognition, alongside other domains such as negative and positive valence systems, arousal and regulatory systems, social processes, and sensorimotor functions. RDoC further emphasizes that each domain, including cognition, should be investigated not only at the behavioural level but also through its neurobiological correlates. In this study, we aim to examine how the covariation between cognition and mental health is reflected in neural markers of cognition, as measured through multimodal neuroimaging.”

      Discussion

      Line 481: “Our analysis confirmed the validity of the g-factor [31] as a quantitative measure of cognition [31], demonstrating that it captures nearly half (39%) of the variance across twelve cognitive performance scores, consistent with prior studies [63–68]. Furthermore, we were able to predict cognition from 133 mental health indices, showing a medium-sized relationship that aligns with existing literature [69,70]. Although the observed mental health-cognition association is lower than within-sample estimates in conventional regression models, it aligns with our prior mega-analysis in children [69]. Notably, this effect size is not considered small in gerontology. In fact, it falls around the 70th percentile of reported effects and approaches the threshold for a large effect at r = 0.32 [71]. While we focused specifically on cognition as an RDoC core domain, the strength of its relationship with mental health may be bounded by the influence of other functional domains, particularly in normative, non-clinical samples – a promising direction for future research.”

      Line 658: “Although recent debates [18] have challenged the predictive utility of MRI for cognition, our multimodal marker integrating 72 neuroimaging phenotypes captures nearly half of the mental health-explained variance in cognition. We demonstrate that neural markers with greater predictive accuracy for cognition also better explain cognition-mental health covariation, showing that multimodal MRI can capture both a substantial cognitive variance and nearly half of its shared variance with mental health. Finally, we show that our neuromarkers explain a substantial portion of the age- and sex-related variance in the cognition-mental health relationship, highlighting their relevance in modeling cognition across demographic strata.

      The remaining unexplained variance in the relationship between cognition and mental health likely stems from multiple sources. One possibility is the absence of certain neuroimaging modalities in the UK Biobank dataset, such as task-based fMRI contrasts, positron emission tomography, arterial spin labeling, and magnetoencephalography/electroencephalography. Prior research has consistently demonstrated strong predictive performance from specific task-based fMRI contrasts, particularly those derived from tasks like the n-Back working memory task and the face-name episodic memory task, none of which is available in the UK Biobank [15,17,61,69,114,142,151].

      Moreover, there are inherent limitations in using MRI as a proxy for brain structure and function. Measurement error and intra-individual variability, such as differences in a cognitive state between cognitive assessments and MRI acquisition, may also contribute to the unexplained variance. According to the RDoC framework, brain circuits represent only one level of neurobiological analysis relevant to cognition [14]. Other levels, including genes, molecules, cells, and physiological processes, may also play a role in the cognition-mental health relationship.

      Nonetheless, neuroimaging provides a valuable window into the biological mechanisms underlying this overlap – insights that cannot be gleaned from behavioural data alone. Ultimately, our findings validate brain-based neural markers as a fundamental neurobiological unit of analysis, advancing our understanding of mental health through the lens of cognition.”

      Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, et al. Research Domain Criteria (RDoC): Toward a New Classification Framework for Research on Mental Disorders. AJP. 2010;167:748–751.

      Abramovitch, A., Short, T., & Schweiger, A. (2021). The C Factor: Cognitive dysfunction as a transdiagnostic dimension in psychopathology. Clinical Psychology Review, 86, 102007.

      East-Richard, C., R. -Mercier, A., Nadeau, D., & Cellard, C. (2020). Transdiagnostic neurocognitive deficits in psychiatry: A review of meta-analyses. Canadian Psychology / Psychologie Canadienne, 61(3), 190–214.

      Wang Y, Anney R, Pat N. The relationship between cognitive abilities and mental health as represented by cognitive abilities at the neural and genetic levels of analysis. eLife. 2025;14:RP105537.

      Brydges CR. Effect Size Guidelines, Sample Size Calculations, and Statistical Power in Gerontology. Innovation in Aging. 2019;3(4):igz036.

      Yarkoni T, Westfall J. Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning. Perspect Psychol Sci. 2017;12(6):1100-1122.

      Comment 2 Title: - Shouldn't it be "MRI markers" (plural)?

      We used the singular form (“marker”) intentionally, as it refers to the composite neuroimaging marker derived from all three MRI modalities in our stacked model. This multimodal marker represents the combined predictive power of all modalities and captures the highest proportion of the mental health-cognition relationship in our analyses.

      Comment 3: Introduction - I miss an explanation of why it is useful to look at cognition-mental health covariation

      We believe we have sufficiently addressed this comment in our response to Reviewer 2, comment 1 above.

      Comment 4: - "Demonstrating that MRI-based neural indicators of cognition capture the covariation between cognition and mental health will thereby support the utility of such indicators for understanding the etiology of mental health" (page 4, line 56-58) - how/why?

      Previous research has largely focused on developing MRI-based neural indicators that accurately predict cognitive performance (Marek et al., 2022; Vieira et al., 2020). Building on this foundation, our findings further demonstrate that the predictive performance of a neural indicator for cognition is closely tied to its ability to explain the covariation between cognition and mental health. In other words, the robustness of a neural indicator – its capacity to capture individual differences in cognition – is strongly associated with how well it reflects the shared variance between cognition and mental health.

      This insight is particularly important within the context of the RDoC framework, which seeks to understand the etiology of mental health through functional domains (such as cognition) and their underlying neurobiological units of analysis (Insel et al., 2010). According to RDoC, for a neural indicator of cognition to be informative for mental health research, it must not only predict cognitive performance but also capture its relationship with mental health.

      Furthermore, RDoC emphasizes the integration of neurobiological measures to investigate the influence of environmental and developmental factors on mental health. In line with this, our neural indicators of cognition may serve as valuable tools in future research aimed at understanding how environmental exposures and developmental trajectories shape mental health outcomes. We discuss this in more detail in the revised Discussion.

      Line 481: “Our analysis confirmed the validity of the g-factor [31] as a quantitative measure of cognition [31], demonstrating that it captures nearly half (39%) of the variance across twelve cognitive performance scores, consistent with prior studies [63–68]. Furthermore, we were able to predict cognition from 133 mental health indices, showing a medium-sized relationship that aligns with existing literature [69,70]. Although the observed mental health-cognition association is lower than within-sample estimates in conventional regression models, it aligns with our prior mega-analysis in children [69]. Notably, this effect size is not considered small in gerontology. In fact, it falls around the 70th percentile of reported effects and approaches the threshold for a large effect at r = 0.32 [71]. While we focused specifically on cognition as an RDoC core domain, the strength of its relationship with mental health may be bounded by the influence of other functional domains, particularly in normative, non-clinical samples – a promising direction for future research.”

      Line 658: “Although recent debates [18] have challenged the predictive utility of MRI for cognition, our multimodal marker integrating 72 neuroimaging phenotypes captures nearly half of the mental health-explained variance in cognition. We demonstrate that neural markers with greater predictive accuracy for cognition also better explain cognition-mental health covariation, showing that multimodal MRI can capture both a substantial cognitive variance and nearly half of its shared variance with mental health. Finally, we show that our neuromarkers explain a substantial portion of the age- and sex-related variance in the cognition-mental health relationship, highlighting their relevance in modeling cognition across demographic strata.

      The remaining unexplained variance in the relationship between cognition and mental health likely stems from multiple sources. One possibility is the absence of certain neuroimaging modalities in the UK Biobank dataset, such as task-based fMRI contrasts, positron emission tomography, arterial spin labeling, and magnetoencephalography/electroencephalography. Prior research has consistently demonstrated strong predictive performance from specific task-based fMRI contrasts, particularly those derived from tasks like the n-Back working memory task and the face-name episodic memory task, none of which is available in the UK Biobank [15,17,61,69,114,142,151].

      Moreover, there are inherent limitations in using MRI as a proxy for brain structure and function. Measurement error and intra-individual variability, such as differences in a cognitive state between cognitive assessments and MRI acquisition, may also contribute to the unexplained variance. According to the RDoC framework, brain circuits represent only one level of neurobiological analysis relevant to cognition [14]. Other levels, including genes, molecules, cells, and physiological processes, may also play a role in the cognition-mental health relationship.

      Nonetheless, neuroimaging provides a valuable window into the biological mechanisms underlying this overlap – insights that cannot be gleaned from behavioural data alone. Ultimately, our findings validate brain-based neural markers as a fundamental neurobiological unit of analysis, advancing our understanding of mental health through the lens of cognition.”

      Marek S, Tervo-Clemmens B, Calabro FJ, Montez DF, Kay BP, Hatoum AS, et al. Reproducible brain-wide association studies require thousands of individuals. Nature. 2022;603:654–660.

      Vieira S, Gong QY, Pinaya WHL, et al. Using Machine Learning and Structural Neuroimaging to Detect First Episode Psychosis: Reconsidering the Evidence. Schizophr Bull. 2020;46(1):17-26.

      Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, et al. Research Domain Criteria (RDoC): Toward a New Classification Framework for Research on Mental Disorders. AJP. 2010;167:748–751.

      Comment 5: - The explanation about the stacking approach is not yet completely clear to me. I don't understand how the target variable can be the dependent variable in both step one and two. Or are those different variables? It would be helpful to also give an example of the target variable in line 88 on page 5

      Thank you for this excellent question. In our stacking approach, the same target variable, the g-factor, is indeed used across both modeling stages, but with a key distinction in how predictions are generated and integrated.

      In the first-level models, we trained separate Partial Least Squares Regression (PLSR) models for each of the 72 neuroimaging phenotypes, each predicting the g-factor independently. The predicted values from these 72 models were then used as input features for the second-level stacked model, which combined them to generate a final prediction of the g-factor. This two-stage framework enables us to integrate information across multiple imaging modalities while maintaining a consistent prediction target.

      To avoid data leakage, both modeling stages were conducted entirely within the training set for each cross-validation fold. Only after the second-level model was trained was it applied to the outer-fold test participants who were not involved in any part of the model training process.
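
The two-stage logic described above can be illustrated with a minimal, self-contained sketch. It is not the pipeline used in the study: univariate least squares stands in for the PLSR base models, a small ridge regression stands in for the second-level learner, the data are synthetic, and all function and variable names are hypothetical. Standardization and hyperparameter tuning are omitted for brevity. What the sketch preserves is that both levels predict the same target and that both are fitted on the training split only.

```python
import random

def fit_ols(x, y):
    """First-level stand-in: fit y = intercept + slope * x by least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope

def predict_ols(model, x):
    intercept, slope = model
    return [intercept + slope * xi for xi in x]

def fit_ridge(rows, y, lam=1e-3):
    """Second-level stand-in: ridge regression solved by Gauss-Jordan elimination."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    my = sum(y) / n
    xc = [[r[j] - means[j] for j in range(p)] for r in rows]
    yc = [yi - my for yi in y]
    # Augmented normal equations [X'X + lam * I | X'y] on centred data.
    a = [[sum(xc[i][j] * xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)]
         + [sum(xc[i][j] * yc[i] for i in range(n))]
         for j in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    weights = [a[j][p] / a[j][j] for j in range(p)]
    intercept = my - sum(w * m for w, m in zip(weights, means))
    return intercept, weights

def stack_fit_predict(train_cols, train_y, test_cols):
    """Fit both levels on training data only, then predict the held-out set."""
    base = [fit_ols(col, train_y) for col in train_cols]
    train_preds = [predict_ols(m, col) for m, col in zip(base, train_cols)]
    intercept, weights = fit_ridge(list(zip(*train_preds)), train_y)
    test_preds = [predict_ols(m, col) for m, col in zip(base, test_cols)]
    return [intercept + sum(w * p for w, p in zip(weights, row))
            for row in zip(*test_preds)]

def mse(pred, obs):
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)

# Synthetic example: three noisy "phenotypes" that each partly reflect the target.
rng = random.Random(0)
g = [rng.gauss(0, 1) for _ in range(200)]
cols = [[gi + rng.gauss(0, 1) for gi in g] for _ in range(3)]
train_y, test_y = g[:160], g[160:]
train_cols = [c[:160] for c in cols]
test_cols = [c[160:] for c in cols]

stacked = stack_fit_predict(train_cols, train_y, test_cols)
train_mean = sum(train_y) / len(train_y)
print("stacked MSE: ", round(mse(stacked, test_y), 3))
print("baseline MSE:", round(mse([train_mean] * len(test_y), test_y), 3))
```

The target appears twice because the first level produces one prediction of it per phenotype, and the second level learns how to weight those predictions into a single estimate.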

      To improve accessibility, we have revised the Methods section (see Page 10) to clarify this approach, ensuring that the description remains technically accurate while being easier to follow.

      Line 188: “We employed nested cross-validation to predict cognition from mental health indices and 72 neuroimaging phenotypes (Fig. 1). Nested cross-validation is a robust method for evaluating machine-learning models while tuning their hyperparameters, ensuring that performance estimates are both accurate and unbiased. Here, we used a nested cross-validation scheme with five outer folds and ten inner folds.

      We started by dividing the entire dataset into five outer folds. Each fold took a turn being held out as the outer-fold test set (20% of the data), while the remaining four folds (80% of the data) were used as an outer-fold training set. Within each outer-fold training set, we performed a second layer of cross-validation – this time splitting the data into ten inner folds. These inner folds were used exclusively for hyperparameter tuning: models were trained on nine of the inner folds and validated on the remaining one, cycling through all ten combinations.

      We then selected the hyperparameter configuration that performed best across the inner-fold validation sets, as determined by the minimal mean squared error (MSE). The model was then retrained on the full outer-fold training set using this hyperparameter configuration and evaluated on the outer-fold test set, using four performance metrics: Pearson r, the coefficient of determination (R²), the mean absolute error (MAE), and the MSE. This entire process was repeated for each of the five outer folds, ensuring that every data point is used for both training and testing, but never at the same time. We opted for five outer folds instead of ten to reduce computational demands, particularly memory and processing time, given the substantial volume of neuroimaging data involved in model training. Five outer folds led to an outer-fold test set of at least n = 4 000, which should be sufficient for model evaluation. In contrast, we retained ten inner folds to ensure robust and stable hyperparameter tuning, maximising the reliability of model selection.

      To model the relationship between mental health and cognition, we employed Partial Least Squares Regression (PLSR) to predict the g-factor from 133 mental health variables. To model the relationship between neuroimaging data and cognition, we used a two-step stacking approach [15–17,61] to integrate information from 72 neuroimaging phenotypes across three MRI modalities. In the first step, we trained 72 base (first-level) PLSR models, each predicting the g-factor from a single neuroimaging phenotype. In the second step, we used the predicted values from these base models as input features for stacked models, which again predicted the g-factor. We constructed four stacked models based on the source of the base predictions: one each for dwMRI, rsMRI, sMRI, and a combined model incorporating all modalities (“dwMRI Stacked”, “rsMRI Stacked”, “sMRI Stacked”, and “All MRI Stacked”, respectively). Each stacked model was trained using one of four machine learning algorithms – ElasticNet, Random Forest, XGBoost, or Support Vector Regression – selected individually for each model (see Supplementary Materials, S6).

      For rsMRI phenotypes, we treated the choice of functional connectivity quantification method – full correlation, partial correlation, or tangent space parametrization – as a hyperparameter. The method yielding the highest performance on the outer-fold training set was selected for predicting the g-factor (see Supplementary Materials, S5).

      To prevent data leakage, we standardized the data using the mean and standard deviation derived from the training set and applied these parameters to the corresponding test set within each outer fold. This standardization was performed at three key stages: before g-factor derivation, before regressing out modality-specific confounds from the MRI data, and before stacking. Similarly, to maintain strict separation between training and testing data, both base and stacked models were trained exclusively on participants from the outer-fold training set and subsequently applied to the corresponding outer-fold test set.

      To evaluate model performance and assess statistical significance, we aggregated the predicted and observed g-factor values from each outer-fold test set. We then computed a bootstrap distribution of Pearson’s correlation coefficient (r) by resampling with replacement 5 000 times, generating 95% confidence intervals (CIs) (Fig. 1). Model performance was considered statistically significant if the 95% CI did not include zero, indicating that the observed associations were unlikely to have occurred by chance.”
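
The bootstrap step at the end of the quoted passage can be sketched as follows. This is a minimal illustration assuming a percentile bootstrap over paired predicted and observed values; the function names and the synthetic data are hypothetical and are not taken from the study's code.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom else 0.0

def bootstrap_ci(pred, obs, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r: resample participant pairs with replacement."""
    rng = random.Random(seed)
    n = len(pred)
    rs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rs.append(pearson_r([pred[i] for i in idx], [obs[i] for i in idx]))
    rs.sort()
    lower = rs[int(n_boot * alpha / 2)]
    upper = rs[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

# Synthetic stand-in for the pooled outer-fold test predictions.
rng = random.Random(1)
observed = [rng.gauss(0, 1) for _ in range(200)]
predicted = [0.5 * o + rng.gauss(0, 1) for o in observed]

lower, upper = bootstrap_ci(predicted, observed)
print("r =", round(pearson_r(predicted, observed), 3))
print("95% CI =", (round(lower, 3), round(upper, 3)))
print("significant:", lower > 0 or upper < 0)
```

Resampling indices rather than values keeps each predicted value paired with its observed value, which is what the correlation requires.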

      Comment 6: Methods - It's not clear from the text and Figure 1 which 12 scores from 11 tests are being used to derive the g-factor. Figure 1 shows only 8 bullet points with 10 scores in A and 13 tests under 'Cognitive tests' in B. Moreover, Supplement S1 describes 12 tests and 14 measures (Prospective Memory test is in the text but not in Supplementary Table 1).

      Thank you for identifying this discrepancy. In the original Figure 1b and in the Supplementary Methods (S1), the “Prospective Memory” test was accidentally duplicated, although it was present in Supplementary Table 1 (Line 53). We have now corrected both the figure and the Supplementary Methods for consistency. To clarify: Figure 1a presents the global mental health and cognitive domains studied, while Figure 1b now accurately lists 1) the 12 cognitive scores from 11 tests used to derive the g-factor (with the Trail Making Test contributing two measures – numeric and alphabetic trails) and 2) the three main categories of mental health indices used as machine learning features.

      We also corrected the Supplementary Materials to remove the duplicate test from the first paragraph. In Supplementary Table 1, there were 11 tests listed, and for the Trail Making test, we specified in the “Core measures” column that this test had 2 derivative scores: duration to complete the numeric path (Trail 1) and duration to complete the alphabetic path (Trail 2).

      Supplementary Materials, Line 46: “We used twelve scores from the eleven cognitive tests that represented the following cognitive domains: reaction time and processing speed (Reaction Time test), working memory (Numeric Memory test), verbal and numerical reasoning (Fluid Intelligence test), executive function (Trail Making Test), non-verbal fluid reasoning (Matrix Pattern Completion test), processing speed (Symbol Digit Substitution test), vocabulary (Picture Vocabulary test), planning abilities (Tower Rearranging test), verbal declarative memory (Paired Associate Learning test), prospective memory (Prospective Memory test), and visual memory (Pairs Matching test) [1].”

      Comment 7: - For the mental health measures: If I understand correctly, the questionnaire items were used individually, but also to create composite scores. This seems counterintuitive, because I would assume that if the raw data is used, the composite scores would not add additional information to that. When reading the Supplement, it seems like I'm not correct… It would be helpful to clarify the text on page 7 in the main text.

      You raise an excellent observation regarding the use of both individual questionnaire items and composite scores. This dual approach was methodologically justified by the properties of Partial Least Squares Regression (PLSR), our chosen first-level machine learning algorithm, which benefits from rich feature sets and can handle multicollinearity through dimensionality reduction. PLSR transforms correlated features into latent variables, meaning both individual items and composite scores can contribute unique information to the model. We elaborate on PLSR's mathematical principles in Supplementary Materials (S5).

      To directly address this concern, we conducted comparative analyses showing that the PLSR model (a single 80/20% training/test split), incorporating all 133 mental health features (both items and composites), outperformed models using either type alone. The full model achieved superior performance (MSE = 0.458, MAE = 0.537, R² = 0.112, Pearson r = 0.336, p-value = 6.936e-112) compared to using only composite scores (93 features; MSE = 0.461, MAE = 0.538, R² = 0.107, Pearson r = 0.328, p-value = 5.8e-106) or only questionnaire items (40 features; MSE = 0.499, MAE = 0.561, R² = 0.033, Pearson r = 0.184, p-value = 2.53e-33). These results confirm that including both data types provides complementary predictive value. We expand on these considerations in the revised Methods section.

      Line 123: “Mental health measures encompassed 133 variables from twelve groups: mental distress, depression, clinical diagnoses related to the nervous system and mental health, mania (including bipolar disorder), neuroticism, anxiety, addictions, alcohol and cannabis use, unusual/psychotic experiences, traumatic events, self-harm behaviours, and happiness and subjective well-being (Fig. 1 and Tables S4 and S5). We included both self-report questionnaire items from all participants and composite diagnostic scores computed following Davis et al. and Dutt et al. [35,36] as features in our first-level (for explanation, see Data analysis section) Partial Least Squares Regression (PLSR) model. This approach leverages PLSR’s ability to handle multicollinearity through dimensionality reduction, enabling simultaneous use of granular symptom-level information and robust composite measures (for mental health scoring details, see Supplementary Materials, S2). We assess the contribution of each mental health index to general cognition by examining the direction and magnitude of its PLSR-derived loadings on the identified latent variables.”

      Comment 8: - Results - The colors in Figure 4 B are a bit hard to differentiate.

      We have updated Figure 4 to enhance colour differentiation by adjusting saturation and brightness levels, improving visual distinction. For further clarity, we split the original figure into two separate figures.

      Comment 9: - Discussion - "Overall, the scores for mental distress, alcohol and cannabis use, and self-harm behaviours relate positively, and the scores for anxiety, neurological and mental health diagnoses, unusual or psychotic experiences, happiness and subjective well-being, and negative traumatic events relate negatively to cognition," - this seems counterintuitive, that some symptoms relate to better cognition and others relate to worse cognition. Could you elaborate on this finding and what it could mean?

      We appreciate you highlighting this important observation. While some associations between mental health indices and cognition may appear counterintuitive at first glance, these patterns are robust (emerging consistently across both univariate correlations and PLSR loadings) and align with previous literature (e.g., Karpinski et al., 2018; Ogueji et al., 2022). For instance, the positive relationship between cognitive ability and certain mental health indicators like help-seeking behaviour has been documented in other population studies (Karpinski et al., 2018; Ogueji et al., 2022), potentially reflecting greater health literacy and access to care among cognitively advantaged individuals. Conversely, the negative associations with conditions like psychotic experiences mirror established neurocognitive deficits in these domains.

      As was initially detailed in Supplementary Materials (S12) and now expanded in our Discussion, these findings likely reflect complex multidimensional interactions. The positive loadings for mental distress indicators may capture: (1) greater help-seeking behaviour among those with higher cognition and socioeconomic resources, and/or (2) psychological overexcitability and rumination tendencies in high-functioning individuals. These interpretations are particularly relevant to the UK Biobank's assessment methods, where mental distress items focused on medical help-seeking rather than symptom severity per se (e.g., as a measure of mental distress, the UK Biobank questionnaire asked whether an individual sought or received medical help for or suffered from mental distress).

      Line 492: “Factor loadings derived from the PLSR model showed that the scores for mental distress, alcohol and cannabis use, and self-harm behaviours related positively, and the scores for anxiety, neurological and mental health diagnoses, unusual or psychotic experiences, happiness and subjective well-being, and negative traumatic events related negatively to the g-factor. Positive PLSR loadings of features related to mental distress may indicate greater susceptibility to or exaggerated perception of stressful events, psychological overexcitability, and predisposition to rumination in people with higher cognition [72]. On the other hand, these findings may be specific to the UK Biobank cohort and the way the questions for this mental health category were constructed. In particular, to evaluate mental distress, the UK Biobank questionnaire asked whether an individual sought or received medical help for or suffered from mental distress. In this regard, the estimate for mental distress may be more indicative of whether an individual experiencing mental distress had an opportunity or aspiration to visit a doctor and seek professional help [73]. Thus, people with better cognitive abilities and also with a higher socioeconomic status may indeed be more likely to seek professional help.

      Limited evidence supports a positive association between self-harm behaviours and cognitive abilities, with some studies indicating higher cognitive performance as a risk factor for non-suicidal self-harm. Research shows an inverse relationship between cognitive control of emotion and suicidal behaviours that weakens over the life course [73,74]. Some studies have found a positive correlation between cognitive abilities and the risk of nonsuicidal self-harm, suicidal thoughts, and suicidal plans that may be independent of or, conversely, affected by socioeconomic status [75,76]. In our study, the magnitude of the association between self-harm behaviours and cognition was low (Fig. 2), indicating a weak relationship.

      Positive PLSR loadings of features related to alcohol and cannabis may also indicate the influence of other factors. Overall, this relationship is believed to be largely affected by age, income, education, social status, social equality, social norms, and quality of life [79–80]. For example, education level and income correlate with cognitive ability and alcohol consumption [79,81–83]. Research also links a higher probability of having tried alcohol or recreational drugs, including cannabis, to a tendency of more intelligent individuals to approach evolutionarily novel stimuli [84,85]. This hypothesis is supported by studies showing that cannabis users perform better on some cognitive tasks [86]. Alternatively, frequent drinking can indicate higher social engagement, which is positively associated with cognition [87]. Young adults often drink alcohol as a social ritual in university settings to build connections with peers [88]. In older adults, drinking may accompany friends or family visits [89,90]. Mixed evidence on the link between alcohol and drug use and cognition makes it difficult to draw definite conclusions, leaving an open question about the nature of this relationship.

      Consistent with previous studies, we showed that anxiety and negative traumatic experiences were inversely associated with cognitive abilities [90–93]. Anxiety may be linked to poorer cognitive performance via reduced working memory capacity, increased focus on negative thoughts, and attentional bias to threatening stimuli that hinder the allocation of cognitive resources to a current task [94–96]. Individuals with PTSD consistently showed impaired verbal and working memory, visual attention, inhibitory function, task switching, cognitive flexibility, and cognitive control [97–100]. Exposure to traumatic events that did not reach the PTSD threshold was also linked to impaired cognition. For example, childhood trauma is associated with worse performance in processing speed, attention, and executive function tasks in adulthood, and age at a first traumatic event is predictive of the rate of executive function decline in midlife [101,102]. In the UK Biobank cohort, adverse life events have been linked to lower cognitive flexibility, partially via depression level [103].

      In agreement with our findings, cognitive deficits are often found in psychotic disorders [104,105]. We treated neurological and mental health symptoms as predictor variables and did not stratify or exclude people based on psychiatric status or symptom severity. Since no prior studies have examined isolated psychotic symptoms (e.g., recent unusual experiences, hearing unreal voices, or seeing unreal visions), we avoid speculating on how these symptoms relate to cognition in our sample.

      Finally, negative PLSR loadings of the features related to happiness and subjective well-being may be specific to the study cohort, as these findings do not agree with some previous research [107–109]. On the other hand, our results agree with the study linking excessive optimism or optimistic thinking to lower cognitive performance in memory, verbal fluency, fluid intelligence, and numerical reasoning tasks, and suggesting that pessimism or realism indicates better cognition [110]. The concept of realism/optimism as indicators of cognition is a plausible explanation for a negative association between the g-factor and friendship satisfaction, as well as a negative PLSR loading of feelings that life is meaningful, especially in older adults who tend to reflect more on the meaning of life [111]. The latter is supported by the study showing a negative association between cognitive function and the search for the meaning of life and a change in the pattern of this relationship after the age of 60 [112]. Finally, a UK Biobank study found a positive association of happiness with speed and visuospatial memory but a negative relationship with reasoning ability [113].”

      Karpinski RI, Kinase Kolb AM, Tetreault NA, Borowski TB. High intelligence: A risk factor for psychological and physiological overexcitabilities. Intelligence. 2018;66:8–23.

      Ogueji IA, Okoloba MM. Seeking Professional Help for Mental Illness: A Mixed-Methods Study of Black Family Members in the UK and Nigeria. Psychol Stud. 2022;67:164–177.

      Comment 10: - All neuroimaging factors together explain 48% of the variance in the cognition-mental health relationship. However, this relationship is only r=0.3 - so then the effect of neuroimaging factors seems a lot smaller… What does it mean?

      Thank you for raising this critical point. We have addressed this point in our response to Reviewer 1, comment 2, Reviewer 1, comment 3 and Reviewer 2, comment 1.

      Briefly, cognition is related to mental health at around r = 0.3 and to neuroimaging phenotypes at around r = 0.4. These levels of relationship strength are consistent with what has been shown in the literature (e.g., Wang et al., 2025 and Vieira et al., 2020). We discussed the relationship between cognition and mental health in our response to Reviewer 2, comment 1 above. In short, this relationship reflects just one functional domain – mental health may also be associated with other domains such as negative and positive valence systems, arousal and regulatory systems, social processes, and sensorimotor functions. Moreover, in the context of gerontology research, this effect size is considered relatively large (Brydges, 2019).

      We conducted a commonality analysis to investigate the unique and shared variance of mental health and neuroimaging phenotypes in explaining cognition. As we discussed in our response to Reviewer 1, comment 2, we were able to account for 48% of the covariation between cognition and mental health using the MRI modalities available in the UK Biobank. The remaining 52% of unexplained variance may arise from several sources.

      One possibility is the absence of certain neuroimaging modalities in the UK Biobank dataset, such as task-based fMRI contrasts, positron emission tomography, arterial spin labeling, and magnetoencephalography/electroencephalography. Prior research from our group and others has consistently demonstrated strong predictive performance from specific task-based fMRI contrasts, particularly those derived from tasks like the n-Back working memory task and the face-name episodic memory task, none of which is available in the UK Biobank (Tetereva et al., 2025).

      Moreover, there are inherent limitations in using MRI as a proxy for brain structure and function. Measurement error and intra-individual variability, such as differences in a cognitive state between cognitive assessments and MRI acquisition, may also contribute to the unexplained variance. According to the RDoC framework, brain circuits represent only one level of neurobiological analysis relevant to cognition. Other levels, including genes, molecules, cells, and physiological processes, may also play a role in the cognition-mental health relationship.
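
To make the meaning of the 48% figure concrete, the commonality analysis mentioned above can be sketched for two predictor sets. The decomposition below is the standard two-set form; the R² values are hypothetical, chosen only so that the shared portion comes to 48% of the mental health-explained variance, and are not the values estimated in the study. The function name and dictionary keys are likewise illustrative.

```python
def commonality(r2_a, r2_b, r2_ab):
    """Two-set commonality analysis.

    r2_a:  R² of the model using predictor set A only (e.g. mental health)
    r2_b:  R² of the model using predictor set B only (e.g. neuroimaging)
    r2_ab: R² of the model using both sets together
    """
    unique_a = r2_ab - r2_b          # variance explained only by A
    unique_b = r2_ab - r2_a          # variance explained only by B
    common = r2_a + r2_b - r2_ab     # variance explained jointly by A and B
    return {
        "unique_a": unique_a,
        "unique_b": unique_b,
        "common": common,
        # Fraction of A's explained variance that B also captures.
        "share_of_a": common / r2_a,
    }

# Hypothetical inputs, for illustration only.
parts = commonality(r2_a=0.10, r2_b=0.16, r2_ab=0.212)
for name, value in parts.items():
    print(name, round(value, 3))
```

Under these illustrative numbers, the percentage is expressed relative to the variance in cognition that mental health explains, not relative to the total variance in cognition, which is why a large share can coexist with a modest correlation of around r = 0.3.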

      We have now incorporated these considerations into the Discussion section.

      Line 481: “Our analysis confirmed the validity of the g-factor [31] as a quantitative measure of cognition [31], demonstrating that it captures nearly half (39%) of the variance across twelve cognitive performance scores, consistent with prior studies [63–68]. Furthermore, we were able to predict cognition from 133 mental health indices, showing a medium-sized relationship that aligns with existing literature [69,70]. Although the observed mental health-cognition association is lower than within-sample estimates in conventional regression models, it aligns with our prior mega-analysis in children [69]. Notably, this effect size is not considered small in gerontology. In fact, it falls around the 70th percentile of reported effects and approaches the threshold for a large effect at r = 0.32 [71]. While we focused specifically on cognition as an RDoC core domain, the strength of its relationship with mental health may be bounded by the influence of other functional domains, particularly in normative, non-clinical samples – a promising direction for future research.”

      Line 658: “Although recent debates [18] have challenged the predictive utility of MRI for cognition, our multimodal marker integrating 72 neuroimaging phenotypes captures nearly half of the mental health-explained variance in cognition. We demonstrate that neural markers with greater predictive accuracy for cognition also better explain cognition-mental health covariation, showing that multimodal MRI can capture both a substantial cognitive variance and nearly half of its shared variance with mental health. Finally, we show that our neuromarkers explain a substantial portion of the age- and sex-related variance in the cognition-mental health relationship, highlighting their relevance in modeling cognition across demographic strata.

      The remaining unexplained variance in the relationship between cognition and mental health likely stems from multiple sources. One possibility is the absence of certain neuroimaging modalities in the UK Biobank dataset, such as task-based fMRI contrasts, positron emission tomography, arterial spin labeling, and magnetoencephalography/electroencephalography. Prior research has consistently demonstrated strong predictive performance from specific task-based fMRI contrasts, particularly those derived from tasks like the n-Back working memory task and the face-name episodic memory task, none of which is available in the UK Biobank [15,17,61,69,114,142,151].

      Moreover, there are inherent limitations in using MRI as a proxy for brain structure and function. Measurement error and intra-individual variability, such as differences in a cognitive state between cognitive assessments and MRI acquisition, may also contribute to the unexplained variance. According to the RDoC framework, brain circuits represent only one level of neurobiological analysis relevant to cognition [14]. Other levels, including genes, molecules, cells, and physiological processes, may also play a role in the cognition-mental health relationship.

      Nonetheless, neuroimaging provides a valuable window into the biological mechanisms underlying this overlap – insights that cannot be gleaned from behavioural data alone. Ultimately, our findings validate brain-based neural markers as a fundamental neurobiological unit of analysis, advancing our understanding of mental health through the lens of cognition.”

      Wang Y, Anney R, Pat N. The relationship between cognitive abilities and mental health as represented by cognitive abilities at the neural and genetic levels of analysis. eLife. 2025;14:RP105537.

      Vieira S, Gong QY, Pinaya WHL, et al. Using Machine Learning and Structural Neuroimaging to Detect First Episode Psychosis: Reconsidering the Evidence. Schizophr Bull. 2020;46(1):17-26.

      Brydges CR. Effect Size Guidelines, Sample Size Calculations, and Statistical Power in Gerontology. Innovation in Aging. 2019;3(4):igz036.

      Tetereva A, Knodt AR, Melzer TR, et al. Improving Predictability, Reliability and Generalisability of Brain-Wide Associations for Cognitive Abilities via Multimodal Stacking. Preprint. bioRxiv. 2025;2024.05.03.589404.

      Reviewer 3:

      Buianova et al. present a comprehensive analysis examining the predictive value of multimodal neuroimaging data for general cognitive ability, operationalized as a derived g-factor. The study demonstrates that functional MRI holds the strongest predictive power among the modalities, while integrating multiple MRI modalities through stacking further enhances prediction performance. The inclusion of a commonality analysis provides valuable insight into the extent to which shared and unique variance across mental health features and neuroimaging modalities contributes to the observed associations with cognition. The results are clearly presented and supported by high-quality visualizations. Limitations of the sample are stated clearly.

      Thank you once more for your constructive and encouraging feedback. We appreciate your careful reading and valuable methodological insights. Your expertise has helped us clarify key methodological concepts and improve the overall rigour of our study.

      Suggestions for improvement:

      (1) The manuscript would benefit from the inclusion of permutation testing to evaluate the statistical significance of the predictive models. This is particularly important given that some of the reported performance metrics are relatively modest, and permutation testing could help ensure that results are not driven by chance.

      Thank you, this is an excellent point. We agree that evaluating the statistical significance of our predictive models is essential.

      In our original analysis, we assessed model performance by generating a bootstrap distribution of Pearson’s r, resampling the data with replacement 5,000 times (see Figure 3b). In response to your feedback, we have made the following updates:

      (1) Improved Figure 3b to explicitly display the 95% confidence intervals.

      (2) Supplemented the results by reporting the exact confidence interval values.

      (3) Clarified our significance testing procedure in the Methods section.

      We considered model performance statistically significant when the 95% confidence interval did not include zero, indicating that the observed associations are unlikely to have occurred by chance.

      We chose bootstrapping over permutation testing because, while both can assess statistical significance, bootstrapping additionally provides uncertainty estimates in the form of confidence intervals. Given the large sample size in our study, significance testing can be less informative, as even small effects may reach statistical significance. Bootstrapping offers a more nuanced understanding of model uncertainty.

      Line 233: “To evaluate model performance and assess statistical significance, we aggregated the predicted and observed g-factor values from each outer-fold test set. We then computed a bootstrap distribution of Pearson’s correlation coefficient (r) by resampling with replacement 5 000 times, generating 95% confidence intervals (CIs) (Fig. 1). Model performance was considered statistically significant if the 95% CI did not include zero, indicating that the observed associations were unlikely to have occurred by chance.”
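      The bootstrap procedure described above can be sketched as follows. This is an illustrative, dependency-free version under stated assumptions, not the study's code: it uses a percentile interval, synthetic observed and predicted values, and hypothetical function names.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_ci(observed, predicted, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Pearson's r between observed and predicted values."""
    rng = random.Random(seed)
    n = len(observed)
    stats = []
    for _ in range(n_boot):
        # Resample (observed, predicted) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([observed[i] for i in idx],
                             [predicted[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-ins (assumption) for pooled outer-fold test-set values.
rng = random.Random(1)
observed = [rng.gauss(0, 1) for _ in range(100)]
predicted = [v + rng.gauss(0, 1) for v in observed]

lo, hi = bootstrap_ci(observed, predicted)
# Decision rule from the text: significant if the 95% CI excludes zero.
significant = lo > 0 or hi < 0
```

      The interval width conveys the uncertainty of the performance estimate, which is the extra information a permutation p-value alone would not provide.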

      (2) Applying and testing the trained models on an external validation set would increase confidence in generalisability of the model.

      We appreciate this excellent suggestion. While we considered this approach, implementing it would require identifying an appropriate external dataset with comparable neuroimaging and behavioural measures, along with careful matching of acquisition protocols and variable definitions across sites. These challenges extend beyond the scope of the current study, though we fully agree that this represents an important direction for future research.

      Our findings, obtained from one of the largest neuroimaging datasets to date with training and test samples exceeding most previous studies, align closely with existing literature: the predictive accuracy of each neuroimaging phenotype and modality for cognition matches the effect size reported in meta-analyses (r ≈ 0.4; e.g., Vieira et al., 2020). The ability of dwMRI, rsMRI and sMRI to capture the cognition-mental health relationship is, in turn, consistent with our previous work in pediatric populations (Wang et al., 2025; Pat et al., 2022).

      Vieira S, Gong QY, Pinaya WHL, et al. Using Machine Learning and Structural Neuroimaging to Detect First Episode Psychosis: Reconsidering the Evidence. Schizophr Bull. 2020;46(1):17-26.

      Wang Y, Anney R, Pat N. The relationship between cognitive abilities and mental health as represented by cognitive abilities at the neural and genetic levels of analysis. eLife. 2025;14:RP105537.

      Pat N, Wang Y, Anney R, Riglin L, Thapar A, Stringaris A. Longitudinally stable, brain-based predictive models mediate the relationships between childhood cognition and socio-demographic, psychological and genetic factors. Hum Brain Mapp. 2022;43:5520–5542.

      (3) The rationale for selecting a 5-by-10-fold cross-validation scheme is not clearly explained. Clarifying why this structure was preferred over more commonly used alternatives, such as 10-by-10 or 5-by-5 cross-validation, would strengthen the methodological transparency.

      Thank you for this important methodological question. Our choice of a 5-by-10-fold cross-validation scheme was motivated by the need to balance robust hyperparameter tuning with computational efficiency, particularly memory and processing time. Using five outer folds rather than ten reduced computational demands, given the substantial volume of neuroimaging data involved in model training, while still allowing us to assess model performance across multiple data partitions with outer-fold test sets of at least n = 4,000. In contrast, employing ten inner folds ensured robust and stable hyperparameter tuning, maximizing the reliability of model selection. We now provide additional rationale for this design decision on Page 10.

      Line 188: “We employed nested cross-validation to predict cognition from mental health indices and 72 neuroimaging phenotypes (Fig. 1). Nested cross-validation is a robust method for evaluating machine-learning models while tuning their hyperparameters, ensuring that performance estimates are both accurate and unbiased. Here, we used a nested cross-validation scheme with five outer folds and ten inner folds.

      We started by dividing the entire dataset into five outer folds. Each fold took a turn being held out as the outer-fold test set (20% of the data), while the remaining four folds (80% of the data) were used as an outer-fold training set. Within each outer-fold training set, we performed a second layer of cross-validation – this time splitting the data into ten inner folds. These inner folds were used exclusively for hyperparameter tuning: models were trained on nine of the inner folds and validated on the remaining one, cycling through all ten combinations.

      We then selected the hyperparameter configuration that performed best across the inner-fold validation sets, as determined by the minimal mean squared error (MSE). The model was then retrained on the full outer-fold training set using this hyperparameter configuration and evaluated on the outer-fold test set, using four performance metrics: Pearson r, the coefficient of determination (R<sup>2</sup>), the mean absolute error (MAE), and the MSE. This entire process was repeated for each of the five outer folds, ensuring that every data point is used for both training and testing, but never at the same time. We opted for five outer folds instead of ten to reduce computational demands, particularly memory and processing time, given the substantial volume of neuroimaging data involved in model training. Five outer folds led to an outer-fold test set at least n = 4 000, which should be sufficient for model evaluation. In contrast, we retained ten inner folds to ensure robust and stable hyperparameter tuning, maximising the reliability of model selection.”
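      The nested scheme described above (five outer folds for evaluation, ten inner folds for tuning, selection by minimal MSE) can be sketched in a few lines. The model here is a one-feature ridge regression and the penalty grid is arbitrary; both are placeholders chosen to keep the example self-contained, not the models or hyperparameters used in the study.

```python
import random


def k_folds(indices, k, rng):
    """Shuffle indices and split them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def fit_ridge(x, y, lam):
    """One-feature ridge regression; returns (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + lam)
    return my - slope * mx, slope


def mse(model, x, y):
    """Mean squared error of a fitted (intercept, slope) model."""
    b0, b1 = model
    return sum((b0 + b1 * a - b) ** 2 for a, b in zip(x, y)) / len(y)


def nested_cv(x, y, lambdas, n_outer=5, n_inner=10, seed=0):
    """Return one (best_lambda, outer_test_mse) pair per outer fold."""
    rng = random.Random(seed)
    outer = k_folds(list(range(len(y))), n_outer, rng)
    results = []
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        # Inner folds are built from the outer training set only.
        inner = k_folds(train_idx, n_inner, rng)

        def inner_mse(lam):
            errs = []
            for v, val_idx in enumerate(inner):
                fit_idx = [j for f, fold in enumerate(inner) if f != v for j in fold]
                model = fit_ridge([x[j] for j in fit_idx], [y[j] for j in fit_idx], lam)
                errs.append(mse(model, [x[j] for j in val_idx], [y[j] for j in val_idx]))
            return sum(errs) / len(errs)

        best_lam = min(lambdas, key=inner_mse)
        # Retrain on the full outer training set, then score the held-out fold once.
        model = fit_ridge([x[j] for j in train_idx], [y[j] for j in train_idx], best_lam)
        test_err = mse(model, [x[j] for j in test_idx], [y[j] for j in test_idx])
        results.append((best_lam, test_err))
    return results


# Synthetic data (assumption): a single noisy linear relationship.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]
lambdas = [0.0, 1.0, 10.0, 100.0]
results = nested_cv(x, y, lambdas)
```

      Because the outer test fold never enters the inner loop, the tuned hyperparameter is chosen without access to the data used for the final performance estimate.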

      (4) A more detailed discussion of which specific brain regions or features within each neuroimaging modality contributed most strongly to the prediction of cognition would enhance neurobiological relevance of the findings.

      Thank you for this thoughtful suggestion. To address this point, we have included feature importance plots for the top-performing neuroimaging phenotypes within each modality (Figure 5 and Figures S2–S4), demonstrating the relative contributions of individual features to the predictive models. While we maintain our primary focus on cross-modality performance comparisons in the main text, as this aligns with our central aim of evaluating multimodal MRI markers at the integrated level, we outline the contribution of neuroimaging features with the highest predictive performance for cognition in the revised Results and Discussion.

      Methods

      Line 255: “To determine which neuroimaging features contribute most to the predictive performance of top-performing phenotypes within each modality, while accounting for the potential latent components derived from neuroimaging, we assessed feature importance using the Haufe transformation [62]. Specifically, we calculated Pearson correlations between the predicted g-factor and scaled and centred neuroimaging features across five outer-fold test sets. We also examined whether the performance of neuroimaging phenotypes in predicting cognition per se is related to their ability to explain the link between cognition and mental health. Here, we computed the correlation between the predictive performance of each neuroimaging phenotype and the proportion of the cognition-mental health relationship it captures. To understand how demographic factors, including age and sex, contribute to this relationship, we also conducted a separate set of commonality analyses treating age, sex, age<sup>2</sup>, age×sex, and age<sup>2</sup>×sex as an additional set of explanatory variables (Fig. 1).”
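      The feature-importance step described above, correlating each feature with the model's predicted values, can be illustrated with a short sketch. The feature names, the weights used to generate the "predicted" values, and the data are all hypothetical; the real analysis operates on predictions from the fitted models pooled across outer-fold test sets.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def haufe_importance(features, predicted):
    """Correlate every feature with the predicted target.

    features: dict mapping feature name -> list of values.
    Returns a dict mapping feature name -> Pearson r with the predictions.
    Pearson's r is unaffected by centring or scaling, so standardizing the
    features beforehand does not change the result.
    """
    return {name: pearson(values, predicted) for name, values in features.items()}


# Synthetic features (assumption): three independent, hypothetical measures.
rng = random.Random(7)
n = 1000
features = {
    "connection_a": [rng.gauss(0, 1) for _ in range(n)],
    "connection_b": [rng.gauss(0, 1) for _ in range(n)],
    "connection_c": [rng.gauss(0, 1) for _ in range(n)],
}
# Stand-in for model output: positive weight on a, negative on b, none on c.
predicted = [0.8 * a - 0.4 * b
             for a, b in zip(features["connection_a"], features["connection_b"])]

importance = haufe_importance(features, predicted)
```

      Positive values mark features that rise with the predicted g-factor and negative values mark features that fall with it, which is how the positive and negative associations reported in the Results can be read.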

      Results

      dwMRI

      Line 331: “Overall, models based on structural connectivity metrics performed better than TBSS and probabilistic tractography (Fig. 3). TBSS, in turn, performed better than probabilistic tractography (Fig. 3 and Table S13). The number of streamlines connecting brain areas parcellated with aparc MSA-I had the best predictive performance among all dwMRI neuroimaging phenotypes (R<sup>2</sup><sub>mean</sub> = 0.052, r<sub>mean</sub> = 0.227, 95% CI [0.212, 0.235]). To identify features driving predictions, we correlated streamline counts in aparc MSA-I parcellation with the predicted g-factor values from the PLSR model. Positive associations with the predicted g-factor were strongest for left superior parietal-left caudal anterior cingulate, left caudate-right amygdala, and left putamen-left hippocampus connections. The most marked negative correlations involved left putamen-right posterior thalamus and right pars opercularis-right caudal anterior cingulate pathways (Fig. 5 and Supplementary Fig. S2).”

      rsMRI

      Line 353: “Among RSFC metrics for 55 and 21 ICs, tangent parameterization matrices yielded the highest performance in the training set compared to full and partial correlation, as indicated by the cross-validation score. Functional connections between the limbic (IC10) and dorsal attention (IC18) networks, as well as between the ventral attention (IC15) and default mode (IC11) networks, displayed the highest positive association with cognition. In contrast, functional connectivity between the limbic (IC43, the highest activation within network) and default mode (IC11) and limbic (IC45) and frontoparietal (IC40) networks, between the dorsal attention (IC18) and frontoparietal (IC25) networks, and between the ventral attention (IC15) and frontoparietal (IC40) networks, showed the highest negative association with cognition (Fig. 5 and Supplementary Fig. S3 and S4).”

      sMRI

      Line 373: “FreeSurfer subcortical volumetric subsegmentation and ASEG had the highest performance among all sMRI neuroimaging phenotypes (R<sup>2</sup><sub>mean</sub> = 0.068, r<sub>mean</sub> = 0.244, 95% CI [0.237, 0.259] and R<sup>2</sup><sub>mean</sub> = 0.059, r<sub>mean</sub> = 0.235, 95% CI [0.221, 0.243], respectively). In FreeSurfer subcortical volumetric subsegmentation, volumes of all subcortical structures, except for left and right hippocampal fissures, showed positive associations with cognition. The strongest relations were observed for the volumes of bilateral whole hippocampal head and whole hippocampus (Fig. 5 and Supplementary Fig. S5 for feature importance maps). Grey matter morphological characteristics from ex vivo Brodmann Area Maps showed the lowest predictive performance (R<sup>2</sup><sub>mean</sub> = 0.008, r<sub>mean</sub> = 0.089, 95% CI [0.075, 0.098]; Fig. 3 and Table S15).”

      Discussion

      dwMRI

      Line 562: “Among dwMRI-derived neuroimaging phenotypes, models based on structural connectivity between brain areas parcellated with aparc MSA-I (streamline count), particularly connections with bilateral caudal anterior cingulate (left superior parietal-left caudal anterior cingulate, right pars opercularis-right caudal anterior cingulate), left putamen (left putamen-left hippocampus, left putamen-right posterior thalamus), and amygdala (left caudate-right amygdala), result in a neural indicator that best reflects microstructural resources associated with cognition, as indicated by predictive modeling, and more importantly, shares the highest proportion of the variance with mental health-g, as indicated by commonality analysis.”

      rsMRI

      Line 583: “We extend findings on the superior performance of rsMRI in predicting cognition, which aligns with the literature [15, 28], by showing that it also explains almost a third of the variance in cognition that mental health captures. At the rsMRI neuroimaging phenotype level, this performance is mostly driven by RSFC patterns among 55 ICA-derived networks quantified using tangent space parameterization. At a feature level, these associations are best captured by the strength of functional connections among limbic, dorsal attention and ventral attention, frontoparietal and default mode networks. These functional networks have been consistently linked to cognitive processes in prior research [127–130].”

      sMRI

      Line 608: “Integrating information about brain anatomy by stacking sMRI neuroimaging phenotypes allowed us to explain a third of the link between cognition and mental health. Among all sMRI neuroimaging phenotypes, those that quantified the morphology of subcortical structures, particularly volumes of bilateral hippocampus and hippocampal head, explain the highest portion of the variance in cognition captured by mental health. Our findings show that, at least in older adults, volumetric properties of subcortical structures are not only more predictive of individual variations in cognition but also explain a greater portion of cognitive variance shared with mental health than structural characteristics of more distributed cortical grey and white matter. This aligns with the Scaffolding Theory that proposes stronger compensatory engagement of subcortical structures in cognitive processing in older adults [138–140].”

      (5) The formatting of some figure legends could be improved for clarity - for example, some subheadings were not formatted in bold (e.g., Figure 2c).

      Thank you for noticing this. We have updated the figures to enhance clarity, keeping subheadings plain while bolding figure numbers and MRI modality names.

    1. Reviewer #2 (Public review):

      Summary:

      Egawa et al describe the developmental timeline of the assembly of nodes of Ranvier in the chick brainstem auditory circuit. In this unique system, the spacing between nodes varies significantly in different regions of the same axon from early stages, which the authors suggest is critical for accurate sound localization. Egawa et al set out to determine which factors regulate this differential node spacing. They do this by using immunohistological analyses to test the correlation of node spacing with morphological properties of the axons, and properties of oligodendrocytes, glial cells that wrap axons with the myelin sheaths that flank the nodes of Ranvier. They find that axonal structure does not vary significantly, but that oligodendrocyte density and morphology varies in the different regions traversed by these axons, which suggests this is a key determinant of the region-specific differences in node density and myelin sheath length. They also find that differential oligodendrocyte density is partly determined by secreted neuronal signals, as (presumed) blockage of vesicle fusion with tetanus toxin reduced oligodendrocyte density in the region where it is normally higher. Based on these findings, the authors propose that oligodendrocyte morphology, myelin sheath length, and consequently nodal distribution are primarily determined by intrinsic oligodendrocyte properties rather than neuronal factors such as activity.

      Major comments:

      (1) The authors should test the efficiency of TeNT to validate that vesicular release is indeed inhibited from expressing neurons. Additionally, the authors should clarify if their TeNT expression system results in the whole tract being silenced, or results in sparse vesicular release inhibition in only a few neurons.

      (2) The authors should revise their statistical analyses throughout, and supply additional information to explain the rationale for the statistical tests used, including e.g. data normality, paired sampling, number of samples/independent biological replicates.

      (3) The main finding of the study is that the density of nodes differs between two regions of the chicken auditory circuit, probably due to morphological differences in the respective oligodendrocytes. Can the authors discuss if this finding is likely to be specific to the avian auditory circuit?

      (4) The study shows a correlation between node spacing and oligodendrocyte density, but the authors did not manipulate oligodendrocyte density per se (i.e. cell-autonomously). The authors should either include such experiments, or discuss their value in supporting the interpretation of their results.

      (5) The authors should discuss very pertinent prior studies, in particular to contextualize their findings with (a) known neuron-autonomous modes of node formation prior to myelination, (b) known effects of vesicular fusion directly on myelinating capacity and oligodendrogenesis, (c) known correlation of myelin length and thickness with axonal diameter, (d) regional heterogeneity in the oligodendrocyte transcriptome.

      Significance:

      In our view the study tackles a fundamental question likely to be of interest to a specialized audience of cellular neuroscientists. This descriptive study is suggestive that in the studied system, oligodendrocyte density determines the spacing between nodes of Ranvier, but further manipulations of oligodendrocyte density per se are needed to test this convincingly.

    2. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1:

      Evidence, reproducibility and clarity

      The manuscript by Egawa and colleagues investigates differences in nodal spacing in an avian auditory brain stem circuit. The results are clearly presented and data are of very high quality. The authors make two main conclusions:

      (1) Node spacing, i.e. internodal length, is intrinsically specified by the oligodendrocytes in the region they are found in, rather than axonal properties (branching or diameter).

      (2) Activity is necessary (we don't know what kind of signaling) for normal numbers of oligodendrocytes and therefore the extent of myelination.

      These are interesting observations, albeit phenomenological. I have only a few criticisms that should be addressed:

      (1) The use of the term 'distribution' when describing the location of nodes is confusing. I think the authors mean rather than the patterns of nodal distribution, the pattern of nodal spacing. They have investigated spacing along the axon. I encourage the authors to substitute node spacing or internodal length for node distribution.

      Thanks for your suggestion to avoid confusion. We used the phrase "nodal spacing" instead of "nodal distribution" throughout the revised manuscript.

      (2) In Seidl et al. (J Neurosci 2010) it was reported that axon diameter and internodal length (nodal spacing) were different for regions of the circuit. Can the authors help me better understand the difference between the Seidl results and those presented here?

      As a key distinction, our study focuses specifically on the main trunk of the contralateral projection of NM axons. This projection features a sequential branching structure known as the delay line, where collateral branches form terminal arbors and connect to the ventral dendritic layer of NL neurons. This structural organization plays a critical role in influencing the dynamic range of ITD detection by regulating conduction delays along the NM axon trunk.

      The study by Seidl et al. (2010) is a pioneering work that measured the diameter of NM axons using electron microscopy, providing highly reliable data. However, due to the technical limitations of electron microscopy, which does not allow for the continuous tracing of individual axons, it is not entirely clear whether the axons measured in the ventral NL region correspond to terminal arbors of collateral branches or the main trunk of NM axons (see Figure 9E, F in their paper). Instead, they categorized axon diameters based on their distance from the NL cell layer, showing that axon diameter increases distally (see Figure 9G in their paper). Notably, the diameters of ventral axons located more than 120 μm away from the NL cell layer are almost identical to those in the midline.

      As illustrated in our Figure 4D and Supplementary Video 2, the main trunk of the contralateral NM projection is predominantly located in these distal regions. Therefore, our findings complement those of Seidl et al. (2010) rather than contradicting them. We made this point as clear as possible in the text (page 7, line 3).

      (3) The authors looked only in very young animals - are the results reported here applicable only to development, or does additional refinement take place with aging?

      In this study, we examined chick embryos from E9 to just before hatching (E21) and post-hatch chicks up to P9. Chickens begin to perceive sound around E12 and possess sound localization abilities at the time of hatching (Grier et al., 1967) (added to page 4, line 9). Therefore, by E21, the sound localization circuit is largely established.

      On the other hand, additional refinement of the circuit with aging is certainly possible. A key cue for sound localization, interaural time difference (ITD), depends on the distance between the two ears, which increases as the animal grows. As shown in Figure 2G, internodal length increased by approximately 20% between E18 and P9 while maintaining regional differences. Given that NM axons are nearly fully myelinated by E21 (Figure 4D, 6C), this suggests that myelin extends in proportion to the overall growth of the head and brain volume. We described this possibility in the text (page 5, line 21).

      Thus, our study covers not only the early stages of myelination but also the post-functional maturation in the sound localization circuit.

      (4) The fact that internodal length is specified by the oligodendrocyte suggests that activity may not modify the location of nodes of Ranvier - although again, the authors have only looked during early development. This is quite different than this reviewer's original thoughts - that activity altered internodal length and axon diameter. Thus, the results here argue against node plasticity. The authors may choose to highlight this point or argue for or against it based on results in adult birds?

      In this study, we demonstrated that although vesicular release did not affect internodal length, it selectively promoted oligodendrogenesis, thereby supporting the full myelination and hence the pattern of nodal spacing along the NM axons. We believe that this finding falls within the broader scope of 'activity-dependent plasticity' involving oligodendrocytes and nodes.

      As summarized in the excellent review by Bonetto et al. (2021), activity-dependent plasticity in oligodendrocytes encompasses a wide range of phenomena, not limited to changes in internodal length but also including oligodendrogenesis. Moreover, the effects of neuronal activity are not uniform but likely depend on the diversity of both neurons and oligodendrocytes. For example, in the mouse visual cortex, activity-dependent myelination occurs in interneurons but not in excitatory neurons (Yang et al., 2020). Additionally, expression of TeNT in axons affected myelination heterogeneously in zebrafish; some axons were impaired in myelination and the others were not affected at all (Koudelka et al., 2016). In the mouse corpus callosum, neuronal activity influences oligodendrogenesis, which in turn facilitates adaptive myelination (Gibson et al., 2014).

      Thus, rather than refuting the role of activity-dependent plasticity in nodal spacing, our findings emphasize the diversity of underlying regulatory mechanisms. We described these explicitly in text (page 10, line 18).

      Significance

      This paper may argue against node plasticity as a mechanism for tuning of neural circuits. Myelin plasticity is a very hot topic right now and node plasticity reflects myelin plasticity. This seems to be a circuit where perhaps plasticity is NOT occurring. That would be interesting to test directly. One limitation is that this is limited to development.

      This paper does not argue against node plasticity, but rather demonstrates that oligodendrocytes in the NL region exhibit a form of plasticity; they proliferate in response to vesicular release from NM axons, yet do not undergo morphological changes, ensuring adequate oligodendrocyte density for the full myelination of the auditory circuit. Thus, activity-dependent plasticity involving oligodendrocytes would contribute in various ways to each neural circuit, presumably because myelination is driven by complex multicellular interactions between diverse axons and oligodendrocytes. Oligodendrocytes are known to exhibit heterogeneity in morphology, function, responsiveness, and gene profiles (Foerster et al., 2019; Sherafat et al., 2021; Osanai et al., 2022; Valihrach et al., 2022), but the functional significance of this heterogeneity remains largely unclear. This paper also provides insight into how oligodendrocyte heterogeneity may contribute to the fine-tuning of neural circuit function, adding further value to our findings. Importantly, our study covers a wide range of development in the sound localization circuit, from pre-myelination (E9) to post-functional maturation (P9), revealing how the nodal spacing pattern along the axon in this circuit emerges and matures.

      Reviewer #2:

      Evidence, reproducibility and clarity

      Egawa et al describe the developmental timeline of the assembly of nodes of Ranvier in the chick brainstem auditory circuit. In this unique system, the spacing between nodes varies significantly in different regions of the same axon from early stages, which the authors suggest is critical for accurate sound localization. Egawa et al set out to determine which factors regulate this differential node spacing. They do this by using immunohistological analyses to test the correlation of node spacing with morphological properties of the axons, and properties of oligodendrocytes, glial cells that wrap axons with the myelin sheaths that flank the nodes of Ranvier. They find that axonal structure does not vary significantly, but that oligodendrocyte density and morphology varies in the different regions traversed by these axons, which suggests this is a key determinant of the region-specific differences in node density and myelin sheath length. They also find that differential oligodendrocyte density is partly determined by secreted neuronal signals, as (presumed) blockage of vesicle fusion with tetanus toxin reduced oligodendrocyte density in the region where it is normally higher. Based on these findings, the authors propose that oligodendrocyte morphology, myelin sheath length, and consequently nodal distribution are primarily determined by intrinsic oligodendrocyte properties rather than neuronal factors such as activity.

      Major points, detailed below, need to be addressed to overcome some limitations of the study.

      Major comments:

      (1) It is essential that the authors validate the efficiency of TeNT to prove that vesicular release is indeed inhibited, to be able to make any claims about the effect of vesicular release on oligodendrogenesis/myelination.

      eTeNT is a widely used genetically encoded silencing tool and constructs similar to the one used in this study have been successfully applied in primates and rodents to suppress target behaviors via genetic dissection of specific pathways (Kinoshita et al., 2012; Sooksawate et al., 2013). However, precisely quantifying the extent of vesicular release inhibition from NM axons in the brainstem auditory circuit is technically problematic.

      One major limitation is that while A3V efficiently infects NM neurons, its transduction efficiency does not reach 100%. In electrophysiological evaluations, NL neurons receive inputs from multiple NM axons, meaning that responses may still include input from uninfected axons. Additionally, failure to evoke synaptic responses could either indicate successful silencing or failure to stimulate NM axons, making a clear distinction difficult. Furthermore, unlike in motor circuits, we cannot assess the effect of silencing by observing behavioral outputs.

      Thus, we instead opted to quantify the precise expression efficiency of GFP-tagged eTeNT in the cell bodies of NM neurons. The proportion of NM neurons expressing GFP-tagged eTeNT was 89.7 ± 1.6% (N = 6 chicks), which is consistent with previous reports evaluating A3V transduction efficiency in the brainstem auditory circuit (Matsui et al., 2012). These results strongly suggest that synaptic transmission from NM axons was globally silenced by eTeNT at the NL region. We described these explicitly in text (page 8, line 2).

      (2) Related to 1, can the authors clarify if their TeNT expression system results in the whole tract being silenced? It appears from Fig. 6 that their approach leads to sparse expression of TeNT in individual neurons, which enables them to measure myelination parameters. Can the authors discuss how silencing a single axon can lead to a regional effect in oligodendrocyte number?

      Figure 6D depicts a representative axon selected from a dense population of GFP-positive axons in a 200-μm-thick slice after A3V-eTeNT infection to bilateral NM. As shown in Supplementary Video 1 and 2, densely labeled GFP-positive axons can be traced along the main trunk. To prevent any misinterpretation, we have revised the description of Figure 6 in the main text and Figure legend (page 31, line 9), and stated the A3V-eTeNT infection efficiency was 89.7 ± 1.6% in NM neurons, as mentioned above. Based on this efficiency, we interpreted that the global occlusion of vesicular release from most of the NM axons altered the pericellular microenvironment of the NL region, which led to the regional effect on the oligodendrocyte density.

      On the other hand, your question regarding whether sparse expression of eTeNT still has an effect is highly relevant. As we also discussed in our reply to comment 4 by Reviewer #1, the relationship between neuronal activity and oligodendrocytes is highly diverse. In some types of axons, vesicular release is essential for normal myelination, and this process was disrupted by TeNT (Koudelka et al., 2016), suggesting that direct interaction with oligodendrocytes via vesicle release may actively promote myelination in these types of axons.

      To clarify whether the phenotype observed in Figure 6 arises from changes in the pericellular microenvironment at the NL region or from the direct suppression of axon-oligodendrocyte interactions, we included a new Supplementary Figure (Figure 6—figure supplement 1). In this figure, we evaluated node formation on axons sparsely expressing eTeNT by electroporation into the unilateral NM. The results showed that sparse eTeNT expression did not increase the percentages of heminodes or unmyelinated segments. This finding supports our conclusion that the increase in unmyelinated segments caused by A3V-eTeNT resulted from impaired synaptic transmission at NM terminals and subsequent alterations of the pericellular microenvironment at the NL region.

      (3) The authors need to fully revise their statistical analyses throughout and supply additional information that is needed to assess if their analyses are adequate:

      Thank you for your valuable suggestions to improve the rigor of our statistical analyses. We have reanalyzed all statistical tests using R software. In the revised Methods section and Figure Legends, we have clarified the rationale for selecting each statistical test, specified which test was used for each figure, and explicitly defined both n and N. After reevaluation with the Shapiro-Wilk test, we adjusted some analyses to non-parametric tests where appropriate. However, these adjustments did not alter the statistical significance of our results compared to the original analyses.

      (3.1) the authors use a variety of statistical tests and it is not always obvious why they chose a particular test. For example, in Fig. 2G they chose a Kruskal-Wallis test instead of a two-way ANOVA or Mann-Whitney U test, which are much more common in the field. What is the rationale for the test choice?

      We have revised the explanation of our statistical test choices to provide greater clarity and precision. For example, in Figure 2G, we first assessed the normality of the data in each of the four groups using the Shapiro-Wilk test, which revealed that some datasets did not follow a normal distribution. Given this, we selected the Kruskal-Wallis test, a commonly used non-parametric test for comparisons across three or more groups. Since the Kruskal-Wallis test indicated a significant difference, we conducted a post hoc Steel-Dwass test to determine which specific group comparisons were statistically significant.
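      The rank-based comparison described above can be sketched in a few lines. This is only an illustrative, standard-library computation of the Kruskal-Wallis H statistic (without tie correction), not the R code used for the reanalysis; the group values, the function names, and the use of the chi-square critical value for three degrees of freedom (7.815 at the 0.05 level) are assumptions made for the example. The Shapiro-Wilk normality check and the Steel-Dwass post hoc test are not implemented here.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for k independent groups (no tie correction)."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical measurements for four groups (not data from the study).
example_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
h_statistic = kruskal_wallis_h(example_groups)

# Chi-square critical value for k - 1 = 3 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_DF3 = 7.815
groups_differ = h_statistic > CHI2_CRITICAL_DF3
```

      With groups this small the chi-square approximation is rough, so a dedicated statistics package would be preferable for real data.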

      (3.2) in some cases, the choice of test appears wholly inappropriate. For example, in Fig. 3H-K, an unpaired t-test is inappropriate if the two regions were analysed in the same samples. In Fig. 5, was a t-test used for comparisons between multiple groups in the same dataset? If so, an ANOVA may be more appropriate.

      In the case of Figures 3H-K, we compared oligodendrocyte morphology between regions. However, since the number of sparsely labeled oligodendrocytes differs both between regions and across individual samples, there is no strict correspondence between paired measurements. On the other hand, in Figures 5B, C, and E, we compared the density of labeled cells between regions within the same slice, establishing a direct correspondence between paired data points. For these comparisons, we appropriately used a paired t-test.

      (3.3) in some cases, the authors do not mention which test was used (Fig 3: E-G no test indicated, despite asterisks; G/L/M - which regression test that was used? What does r indicate?)

      We have specified the statistical tests used for each figure in the Methods section and Figure Legends for better clarity. Additionally, we have revised the descriptions for Figure 4G, L, and M and their corresponding Figure Legends to explicitly indicate that Spearman’s rank correlation coefficient (rₛ) was used for evaluation.

      (3.4) more concerningly, throughout the results, data may have been pseudo-replicated. t-tests and ANOVAs assume that each observation in a dataset is independent of the other observations. In figures 1-4 and 6 there is a very large "n" number, but the authors do not indicate what this corresponds to. This leaves it open to interpretation, and the large values suggest that the number of nodes, internodal segments, or cells may have been used. These are not independent experimental units, and should be averaged per independent biological replicate - i.e. per animal (N).

      We have now clarified what “n” represents in each figure, as well as the number of animals (N) used in each experiment, in the Figure Legends.

      In this study, developmental stages of chick embryos were defined by HH stage (Hamburger and Hamilton, 1951), minimizing individual variability. Additionally, since our study focuses on the distribution of morphological characteristics of individual cells, averaging measurements per animal would obscure important cellular-level variability and potentially mislead interpretation of data. Furthermore, we employed a strategy of sparse genetic labeling in many experiments, which naturally results in variability in the number of measurable cells per animal. Given the clear distinctions in our data distributions, we believe that averaging per biological replicate is not essential in this case.
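      The distinction between cell-level observations (n) and animals (N) discussed in this exchange can be made concrete with a minimal sketch. The animal identifiers and internodal lengths below are hypothetical, and the helper name is an assumption for illustration; the snippet only shows what collapsing measurements to one value per animal would look like, alongside the cell-level spread that such averaging removes.

```python
from collections import defaultdict
from statistics import mean


def per_animal_means(measurements):
    """Collapse (animal_id, value) pairs to one mean value per animal."""
    by_animal = defaultdict(list)
    for animal_id, value in measurements:
        by_animal[animal_id].append(value)
    return {animal: mean(values) for animal, values in by_animal.items()}


# Hypothetical internodal lengths in micrometers (not data from the study).
measurements = [
    ("chick1", 60.0), ("chick1", 70.0), ("chick1", 80.0),
    ("chick2", 50.0), ("chick2", 90.0),
    ("chick3", 65.0),
]

animal_means = per_animal_means(measurements)
n_cells = len(measurements)    # "n": individual measurements
n_animals = len(animal_means)  # "N": independent biological replicates

# Range of the raw values versus range of the per-animal means.
cell_level_range = max(v for _, v in measurements) - min(v for _, v in measurements)
animal_level_range = max(animal_means.values()) - min(animal_means.values())
```

      Here the per-animal means span a much narrower range than the individual measurements, which is the trade-off both the reviewer and the response are weighing.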

      To further ensure the robustness of our statistical analysis, data presented as boxplots were preliminarily assessed using PlotsOfDifferences, a web-based application that calculates and visualizes effect sizes and 95% confidence intervals based on bootstrapping (https://huygens.science.uva.nl/PlotsOfDifferences/; https://doi.org/10.1101/578575). Effect sizes can serve as a valuable alternative to p-values (Ho, 2018; https://www.nature.com/articles/s41592-019-0470-3). The significant differences reported in our study are also supported by clear differences in effect sizes, ensuring that our conclusions remain robust regardless of the statistical approach used.

      If requested, we would be happy to provide PlotsOfDifferences outputs as supplementary source data files, similar to those used in eLife publications, for each figure.
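      As a rough illustration of what a bootstrapped effect size with a 95% confidence interval involves, the sketch below computes a percentile bootstrap interval for a difference of medians. It is not the PlotsOfDifferences implementation; the choice of the median, the number of resamples, the seed, the function name, and the sample values are all assumptions made for the example.

```python
import random
from statistics import median


def bootstrap_median_difference(a, b, n_resamples=2000, seed=0):
    """Return median(b) - median(a) and a 95% percentile bootstrap interval."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(a, k=len(a))  # sample with replacement
        resample_b = rng.choices(b, k=len(b))
        diffs.append(median(resample_b) - median(resample_a))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return median(b) - median(a), (lower, upper)


# Hypothetical internodal lengths in micrometers for two regions.
nl_region = [55, 60, 62, 58, 65, 61]
tract_region = [180, 175, 190, 185, 170, 188]

effect_size, (ci_lower, ci_upper) = bootstrap_median_difference(nl_region, tract_region)
interval_excludes_zero = ci_lower > 0 or ci_upper < 0
```

      An interval that excludes zero plays the role that a significant p-value plays in the conventional tests, while also conveying the size of the difference.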

      (3.5) related to the pseudo-replication issue, can the authors include individual datapoints in graphs for full transparency, per biological replicates, in addition or in alternative to bar-graphs (e.g. Fig. 5 and 6).

      We have now incorporated individual data points into the bar graphs in Figures 5 and 6.

      (4) The main finding of the study is that the density of nodes differs between two regions of the chicken auditory circuit, probably due to morphological differences in the respective oligodendrocytes. Can the authors discuss if this finding is likely to be specific to the bird auditory circuit?

      The morphological differences of oligodendrocytes between white and gray matter are well established (i.e. shorter myelin at gray matter), but their correspondence with the nodal spacing pattern along the long axonal projections of cortical neurons is not well understood. Future research may find similarities with our findings. Additionally, as mentioned in the final section of the Discussion, the mammalian brainstem auditory circuit is functionally analogous to the avian ITD circuit. Regional differences in nodal spacing along axons have also been observed in the mammalian system, raising the important question of whether these differences are supported by regional heterogeneity in oligodendrocytes. Investigating this possibility will facilitate our understanding of the underlying logic and mechanisms for determining node spacing patterns along axons, as well as provide valuable insights into evolutionary convergence in auditory processing mechanisms. We described these explicitly in text (page 11, line 34).

      (5) Provided the authors amend their statistical analyses, and assuming significant differences remain as shown, the study shows a correlation (but not causation) between node spacing and oligodendrocyte density, but the authors did not manipulate oligodendrocyte density per se (i.e. cell-autonomously). Therefore, the authors should either include such experiments, or revise some of their phrasing to soften their claims and conclusions. For example, the word "determine" in the title could be replaced by "correlate with" for a more accurate representation of the work. Similar sentences throughout the main text should be amended.

      As you summarized in your comment, our results demonstrated that A3V-eTeNT suppressed oligodendrogenesis in the NL region, leading to a reduction in oligodendrocyte density (Figures 6L, M), which caused the emergence of unmyelinated segments. While this is an indirect manipulation of oligodendrocyte density, it nonetheless provides evidence supporting a causal relationship between oligodendrocyte density and nodal spacing.

      The emergence of unmyelinated segments at the NL region further suggests that the myelin extension capacity of oligodendrocytes differs between regions, highlighting regional differences in the intrinsic properties of oligodendrocytes as the most prominent determinant of nodal spacing variation. However, as you correctly pointed out, our findings do not establish direct causation.

      In the future, developing methods to artificially manipulate myelin length could provide a more definitive demonstration of causality. Given these considerations, we have modified the title to replace "determine" with "underlie", ensuring that our conclusions are presented with appropriate nuance.

      (6) The authors fail to introduce, or discuss, very pertinent prior studies, in particular to contextualize their findings with:

      (6.1) known neuron-autonomous modes of node formation prior to myelination, e.g. Zonta et al (PMID 18573915); Vagionitis et al (PMID 35172135); Freeman et al (PMID 25561543)

      (6.2) known effects of vesicular fusion directly on myelinating capacity and oligodendrogenesis, e.g. Mensch et al (PMID 25849985)

      (6.3) known correlation of myelin length and thickness with axonal diameter, e.g. Murray & Blakemore (PMID 7012280); Ibrahim et al (PMID 8583214); Hildebrand et al (PMID 8441812).

      (6.4) regional heterogeneity in the oligodendrocyte transcriptome (page 9, studies summarized in PMID 36313617)

      Thank you for your insightful suggestions. We have incorporated the relevant references you provided and revised the manuscript accordingly to contextualize our findings within the existing literature.

      Minor comments:

      (7) Can the authors amend Fig. 1G with the correct units of measurement, not millimetres.

      Thank you for your suggestion. We have corrected the units in Figure 1G to µm.

      (8) The Olig2 staining in Fig 2C does not appear to be nuclear, as would be expected of a transcription factor and as is well established for Olig2, but rather appears to be excluded from the nucleus, as it is in a ring or donut shape. Can the authors comment on this?

      Oligodendrocytes and OPCs have small cell bodies, often comparable in size to their nuclei. The central void in the ring-like Olig2 staining pattern appears too small to represent the nucleus. Additionally, a similar ring-like appearance is observed in BrdU labeling (Figure 5G), suggesting that this staining pattern may reflect nuclear morphology or other structural features.

      Significance

      In our view the study tackles a fundamental question likely to be of interest to a specialized audience of cellular neuroscientists. This descriptive study is suggestive that in the studied system, oligodendrocyte density determines the spacing between nodes of Ranvier, but further manipulations of oligodendrocyte density per se are needed to test this convincingly.

      The main finding of our study is that the primary determinant of the biased nodal spacing pattern in the sound localization circuit is the regional heterogeneity in the morphology of oligodendrocytes due to their intrinsic properties (e.g., their ability to produce and extend myelin sheaths) rather than the density of the cells. This was based on our observations that a reduction of oligodendrocyte density by A3V-eTeNT expression caused unmyelinated segments but did not increase internodal length (Figure 6), further revealing the importance of oligodendrocyte density in ensuring full myelination for the axons with short internodes. Thus, we think that our study could propose the significance of oligodendrocyte heterogeneity in the circuit function as well as in the nodal spacing using experimental manipulation of oligodendrocyte density. 

      Reviewer #3:

      Evidence, reproducibility and clarity

      The authors have investigated the myelination pattern along the axons of chick avian cochlear nucleus. It has already been shown that there are regional differences in the internodal length of axons in the nucleus magnocellularis. In the tract region across the midline, internodes are longer than in the nucleus laminaris region. Here the authors suggest that the difference in internodal length is attributed to heterogeneity of oligodendrocytes. In the tract region oligodendrocytes would contribute longer myelin internodes, while oligodendrocytes in the nucleus laminaris region would synthesize shorter myelin internodes. Not only length of myelin internodes differs, but also along the same axon unmyelinated areas between two internodes may vary. This is an interesting contribution since all these differences contribute to differential conduction velocity regulating ipsilateral and contralateral innervation of coincidence detector neurons. However, the demonstration falls rather short of being convincing. I have some major concerns:

      (1) The authors neglect the possibility that nodal cluster may be formed prior to myelin deposition. They have investigated stages E12 (no nodal clusters) and E15 (nodal cluster plus MAG+ myelin). Fig. 1D is of dubious quality. It would be important to investigate stages between E12 and E15 to observe the formation of pre-nodes, i.e., clustering of nodal components prior to myelin deposition.

      Thank you for your insightful comment regarding the potential role of pre-nodal clusters in determining internodal length. Indeed, studies in zebrafish have suggested that pre-nodal clustering of node components prior to myelination may prefigure internodal length (Vagionitis et al., 2022). We have incorporated a discussion on whether such pre-nodal clusters could contribute to regional differences in nodal spacing in our manuscript (page 9, line 35).

      Whether pre-nodal clusters are detectable before myelination appears to depend on neuronal subpopulation (Freeman et al., 2015). To investigate the presence of pre-nodal clusters along NM axons in the brainstem auditory circuit, we previously attempted to visualize AnkG signals at E13 and E14. However, we did not observe clear structures indicative of pre-nodal clusters; instead, we only detected sparse fibrous AnkG signals with weak Nav clustering at their ends, consistent with hemi-node features. This result does not exclude the possibility of pre-nodal clusters on NM axons, as the detection limit of immunostaining cannot be ruled out. In brainstem slices, where axons are densely packed, nodal molecules are expressed at low levels across a wide area, leading to a high background signal in immunostaining, which may mask weak pre-nodal cluster signals prior to myelination. Regarding the comment on Figure 1D, we assume you are referring to Figure 2D based on the context. The lack of clarity in the high-magnification images in Figure 2D results from both the high background signal and the limited penetration of the MAG antibody. Furthermore, we are unable to verify Neurofascin accumulation at pre-nodal clusters, as there is currently no commercially available antibody suitable for use in chickens, despite our over 20 years of efforts to identify one for AIS research. Therefore, current methodologies pose significant challenges in visualizing pre-nodal clusters in our model. Future advancements, such as exogenous expression of fluorescently tagged Neurofascin at appropriate densities or knock-in tagging of endogenous molecules, may help overcome these limitations.

      However, a key issue to be discussed in this study is not merely the presence or absence of pre-nodal clusters, but rather whether pre-nodal clusters, if present, would determine regional differences in internodal length. To address this possibility, we have added new data in Figure 6I, measuring the length of unmyelinated segments that emerged following A3V-eTeNT expression.

      If pre-nodal clusters were fixed before myelination and predetermined internodal length, then the length of unmyelinated segments should be equal to or a multiple of the typical internodal length. However, our data showed that unmyelinated segments in the NL region were less than half the length of the typical NL internodal length, contradicting the hypothesis that fixed pre-nodal clusters determine internodal length along NM axons in this region.

      (2) The claim that axonal diameter is constant along the axonal length needs to be demonstrated at the EM level. This would also allow measurement of possible regional differences in the thickness of the myelin sheath and the number of myelin wraps.

      As mentioned in our reply to comment 2 by Reviewer #1, the diameter of NM axons was already evaluated using electron microscopy (EM) in the pioneering study by Seidl et al. (2010). Additionally, EM-based analysis makes it difficult to clearly distinguish between the main trunk of NM axons and thin collateral branches at the NL region. Accordingly, we did not perform EM analysis in this revision.

      In Figure 4, we used palGFP, which is targeted to the cell membrane, allowing us to measure axon diameter by evaluating the distance between two membrane signal peaks. This approach minimizes the influence of the blurring of fluorescence signals on diameter measurements. Thus, we believe that our method is sufficient to evaluate the relative difference in axon diameters between regions and hence to show that axon diameter is not the primary determinant of the 3-fold difference in internodal length between regions. 
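      The peak-to-peak measurement described above can be illustrated with a minimal sketch: given a fluorescence intensity profile drawn across an axon, the diameter is taken as the distance between the two strongest local maxima. The profile values, the pixel size, and the function name are hypothetical, and real images would normally need smoothing and sub-pixel peak fitting that this sketch omits.

```python
def membrane_peak_distance(profile, pixel_size_um):
    """Distance between the two strongest local maxima of a line profile."""
    peaks = [
        i for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("profile needs at least two local maxima")
    # Keep the two brightest peaks, then order them by position.
    strongest = sorted(peaks, key=lambda i: profile[i], reverse=True)[:2]
    left, right = sorted(strongest)
    return (right - left) * pixel_size_um


# Hypothetical intensity profile across one axon (arbitrary units).
profile = [2, 3, 10, 40, 12, 6, 5, 7, 14, 38, 9, 3]
diameter_um = membrane_peak_distance(profile, pixel_size_um=0.1)
```

      Because the measurement depends on peak positions rather than on where the signal falls to background, blur that broadens each membrane signal has less influence on the estimate than it would on a width measured from a cytoplasmic label.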

      (3) The claim that the difference in internodal length is explained by heterogeneity in the sources of oligodendrocytes is not convincing. Oligodendrocytes that a priori share the same origin produce shorter internodes when they remyelinate after a demyelination event.

      The heterogeneity in oligodendrocyte morphology would reflect differences in gene profiles, which, in turn, may arise from differences in their developmental origin and/or pericellular microenvironment of OPCs. We made this point as clear as possible in Discussion (page 9, line 21).

      Significance

      The authors suggest that the difference in internodal length is attributed to heterogeneity of oligodendrocytes. In the tract region oligodendrocytes would contribute longer myelin internodes, while oligodendrocytes in the nucleus laminaris region would synthesize shorter myelin internodes. Not only length of myelin internodes differs, but also along the same axon unmyelinated areas between two internodes may vary. This is an interesting contribution since all these differences contribute to differential conduction velocity regulating ipsilateral and contralateral innervation of coincidence detector neurons.

    1. Meeting for Parents of Terminale Générale Students at Lycée Louis Vincent: Summary and Key Points

      This briefing note summarizes the key information presented at the meeting for parents of final-year general-track (terminale générale) students at Lycée Louis Vincent.

      The final year is a pivotal one, with two major stakes: passing the baccalauréat and preparing post-baccalauréat choices through the Parcoursup platform.

      The baccalauréat consists of 40% continuous assessment, governed by a strict assessment framework to guarantee fairness, and 60% final examinations.

      The Grand Oral (coefficient 10) is a major strategic opportunity. The school is setting up an intensive preparation program with common tests and mock exams, mainly on Saturday mornings.


      In parallel, the Parcoursup process is presented as an indispensable but complex tool that requires preparation from the start of the year.

      Students are encouraged to use resources such as the SupTracker site to analyze admission statistics and to consult the national education psychologists (Psy-EN).

      Emphasis is placed on the crucial importance of a strong academic record, in which teachers' comments, attendance and behavior count as much as grades.

      Management stresses that, although 100% of the school's students received an offer on Parcoursup last year, obtaining one's first-choice wish depends on the fit between the student's plan, their results and the quality of their record.


      1. Introduction and Presentation of the Teaching Staff

      The meeting opened with a presentation of the teaching and administrative teams who supervise the 299 terminale générale students.

      • Management:

      • Ms. X is the deputy head (proviseure adjointe) in charge of the terminale year and exam administration.

      • Ms. Z is the new deputy head, in charge of the première year and the BTS programs.

      • Technological and Industrial Programs:

      • Ms. C is the deputy director for programs (Laboratory, STL, BTS CIRA, BTS Métiers de la chimie).

      • Mr. R is the deputy director for industrial technical programs; he highlighted that general-track baccalauréat holders are well received in BTS programs.

      • Conseillers Principaux d'Éducation (CPE, senior education advisers): The team of three CPEs, including Ms. L and Ms. B, shares the supervision of the terminale classes.

      • Head Teachers (professeurs principaux): Each terminale class is assigned a pair of head teachers, one focusing on class management and the other on guidance, with a flexible division of duties.


      2. The 2026 Baccalauréat: Assessment Arrangements

      Assessment for the baccalauréat is built around two main components. The presentation also recalled the school's excellent results in the previous session.

      Structure of the Final Grade

      • Continuous assessment accounts for 40% of the final grade.
      • Final examinations account for 60% of the final grade.

      Results of Lycée Louis-Vincent (June 2025 Session)

      The school posts high pass rates, reflecting the quality of its support.

      • General track: pass rate > 98%; honors (mention) rate 74%.
      • Technological track: pass rate 93%-96%; honors (mention) rate ≈ 50%.

      Continuous Assessment (40%)

      • Coefficients: All subjects taken in terminale count for a total of 19 coefficients in the continuous assessment calculation.

      • Assessment Framework: The school has put in place an assessment framework (projet d'évaluation) to "guarantee equal treatment" and shared principles. Averages are validated by the class council (conseil de classe) and transmitted via the Livret Scolaire du Lycéen (LSL).

      • Absence Policy: A strict policy is applied to counter strategic absences meant to avoid an assessment. Absent students are given "the opportunity, not the punishment" of making up missed tests on Saturday mornings. If a student has too few grades, these may be deemed "not robust" by the Inspection Générale, leading to an individual one-off exam to validate the student's level.

      Final Examinations (60%)

      The final examinations will take place mainly in June. The coefficients for the general track are as follows:

      • French (taken in première): 10 (5 written, 5 oral).
      • Philosophy: 8.
      • Specialty Subject 1: 16.
      • Specialty Subject 2: 16.
      • Grand Oral: 10.
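      The coefficients listed above sum to 60, matching the 60% share of the final examinations. A minimal sketch of the resulting arithmetic follows; the grades are hypothetical, the dictionary keys and function names are chosen for the example, and combining the two components as 0.4 × continuous assessment + 0.6 × final examinations is an assumption drawn from the 40%/60% split stated earlier.

```python
# Coefficients of the final examinations for the general track, as listed above.
TERMINAL_COEFFICIENTS = {
    "french_written": 5,
    "french_oral": 5,
    "philosophy": 8,
    "specialty_1": 16,
    "specialty_2": 16,
    "grand_oral": 10,
}


def terminal_average(grades):
    """Coefficient-weighted average (out of 20) of the final examinations."""
    total = sum(TERMINAL_COEFFICIENTS.values())
    weighted = sum(TERMINAL_COEFFICIENTS[name] * grades[name]
                   for name in TERMINAL_COEFFICIENTS)
    return weighted / total


def final_grade(continuous_average, grades):
    """Overall grade, assuming a 40% / 60% split between the two components."""
    return 0.4 * continuous_average + 0.6 * terminal_average(grades)


# Hypothetical student: 12/20 everywhere, then 20/20 on the Grand Oral.
baseline = {name: 12.0 for name in TERMINAL_COEFFICIENTS}
with_top_oral = dict(baseline, grand_oral=20.0)

baseline_final = final_grade(12.0, baseline)
boosted_final = final_grade(12.0, with_top_oral)
gain_from_grand_oral = boosted_final - baseline_final
```

      In this hypothetical case, a top mark on the Grand Oral alone raises the overall grade by 0.8 points out of 20, which illustrates why the exam is described as a lever for honors.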

      3. Focus on the Grand Oral

      The Grand Oral is presented as an exam of crucial importance, both for its coefficient and for the skills it assesses.

      • Objectives: Learn to speak in public clearly and convincingly; assess argumentation skills, critical thinking and knowledge related to the two specialty subjects.

      • Format: Students prepare two questions. On the day of the exam, they have preparation time before a 20-minute interview with the panel (10 minutes of presentation/answers and 10 minutes of discussion).

      • Strategic Importance: It is stressed that the Grand Oral is an exam with a high "grade yield." "Getting 20 on the Grand Oral is possible [...] it's fairly common," unlike in other subjects. It is therefore an essential lever for earning honors or securing the diploma.


      4. Preparation and Support at the School

      The school organizes a preparation schedule to support students toward success.

      • Common Tests and Mock Exams: A preparation schedule is being finalized. It will include common tests and mock exams (bacs blancs), mostly held on Saturday mornings.

      • Dedicated Days: Two days will be set aside in April (probably right after the holidays) for the mock exams in the specialty subjects.

      • Mock Oral: A mock oral will be organized specifically to prepare for the Grand Oral.

      • Importance of Practice: Management stresses that practicing under real conditions is "indispensable and essential" for learning to manage time, stress and the environment of a large exam hall.


      5. Guidance and Parcoursup: A Strategic Process

      Guidance is the other major undertaking of the year, requiring early and sustained involvement from students and their families.

      Key Guidance Events

      • Un Jour à l'Université (UJALU): In October, during the holidays. Registration opens on September 29.

      • Oriaction fair: November 20, 21 and 22 in Nancy. The school is not organizing a group trip; families are encouraged to attend, especially on the Saturday, to meet higher-education teaching staff. The fair presents 5,000 programs.

      • Lycée Louis-Vincent programs forum: February 6, 2025, to meet students, professionals and alumni.

      The Parcoursup Calendar

      Although the official calendar has not been published, the three main stages remain the same:

      1. December - January: The site opens with updated information for the 2026 intake. Information phase.

      2. Mid-January - Mid-March: Registration and formulation of wishes (vœux). The mid-March deadline for adding new wishes is firm.

      3. Early June - Early July: Main admission phase, with responses received from programs.

      Tools and Strategies for Parcoursup

      • Essential resources:

      ◦ Psy-EN: Students, especially those who are undecided, are strongly encouraged to book an appointment "right now", before the slots fill up.

      ◦ SupTracker: This statistical tool is presented as "indispensable". It shows which profiles (specialties, honors, grades) were admitted to a given program in previous years. For example, for PASS (medicine), 90% of those admitted in 2025 had a combination of the specialties Mathematics, Physics-Chemistry or SVT.

      ◦ Parcoursup website: The platform has detailed pages on more than 24,000 programs, including admission rates and the average grades of the most recently admitted students.

      • The crucial importance of the school record:

      ◦ The head teacher stresses that selection is carried out by human teams, not by an artificial intelligence.

      ◦ Teachers' comments are crucial. Remarks such as "hard-working, committed, capable" are major assets.

      ◦ Conversely, absences, lateness and remarks about behavior are "really very blocking". A record containing them is often set aside from the outset by selection committees.

      • Strategic advice:

      ◦ The motto is: "It is better to be able to choose than to be forced to choose." For that, the student's results must match their ambitions.

      ◦ Students are advised to cast their applications as widely as possible before the mid-March deadline so as not to close any doors.

      ◦ The results phase in June is acknowledged as a period of intense stress, particularly because of waiting lists. Students are reminded to assess their position by comparing it with the rank of the last applicant admitted the previous year, information that is available on Parcoursup.


      6. Other Questions

      • Euro German section / DNL: This option is recognized by a "European mention" on the baccalauréat diploma. The grades and comments count toward the Parcoursup record and are visible to higher-education institutions.

      • Orientation hour: The weekly hour dedicated to orientation is not necessarily a whole-class session. It amounts to roughly 10 to 19 sessions a year per form teacher, which may take the form of plenary meetings, small-group work or one-to-one interviews, depending on students' needs.

    1. Sovereignty . Autonomy . Independence .



    1. Briefing: Meeting for Parents of Première Générale Students at Lycée Louis Vincent

      Summary

      This briefing note summarizes the key points of the meeting for the parents of the 308 première générale students at Lycée Louis Vincent.

      The aim was to present how the year will unfold, what is at stake in the baccalauréat, and why planning ahead for post-bac choices matters.

      The main takeaways are:

      • A pivotal year for orientation: Although the only orientation decision this year is dropping one of the three specialties for terminale, première is identified as a crucial time to start thinking about the post-bac plan and Parcoursup.

      • Strictness on attendance: A very strict policy will apply to absences and lateness. Because continuous assessment carries 40% of the final baccalauréat grade, attendance is essential, and the school will not hesitate to contact families insistently to make sure students are present.

      • New in the baccalauréat: An early mathematics exam, with a coefficient of 2 and taken without a calculator, is introduced at the end of première. It comes on top of the traditional early French exam.

      • Academic excellence and support: The lycée posts results above the average for its académie, with a 98.44% pass rate in the general baccalauréat and 74% of students earning honors. Parcoursup support is also a strength: 100% of terminale students received an offer of a place the previous year.

      • School and teaching environment: The emphasis is on student well-being in the face of academic pressure, the need for measured use of digital tools (phones, AI), and keeping up fundamental working methods such as handwritten note-taking and reading.

      1. The Première Year: A Strategic Year

      Première générale is presented as "relatively relaxed" in terms of immediate orientation decisions, in contrast with the choice of track in seconde and the choice of higher-education programs in terminale. Its strategic importance is nonetheless strongly emphasized.

      Main Objective and Calendar

      • Choice of specialties: The only orientation decision of the year comes in the third term, when students must indicate which of their three specialties they wish to drop for terminale.

      Orientation calendar:

      • From October: Mid-term reviews will be held to take stock with the teaching teams and to offer appointments to refine each student's plan.

      • November: Participation in the Semaine Nationale de l'Orientation.

      • Third term: The process accelerates with the final choice of the specialty to drop.

      The Importance of Planning Ahead for Parcoursup

      Première is the ideal time to start preparing higher-education choices.

      Families are reminded that in terminale there is very little time to choose among the 60,000 higher-education programs (24,000 of them on Parcoursup).

      Families are therefore advised to start the conversation about the future, even if students have no specific plan yet.

      One suggested approach is to identify what students do not want to do, so as to narrow down their interests step by step.

      2. Organization and School Life

      Teaching and Administrative Team

      The nine première classes (601 to 609) are supervised by a team of dedicated contacts.

      Role and people in charge:

      • Deputy head (proviseur adjoint)
      • Head (proviseur)
      • Form teachers (professeurs principaux): one per class
      • Designated CPEs (CPE référentes)
      • Psychologues de l'Éducation Nationale: available by appointment to refine orientation choices
      • Academic office (secrétariat pédagogique): Mme x, for school records and requests for accommodations

      School Rules and Discipline

      Two points of the school rules are given particular emphasis.

      • Attendance (absences and lateness):
      ◦ Major issue: Continuous assessment counts for 40% of the bac grade. Every grade earned in class therefore matters.
      ◦ Strict policy: The school will be very vigilant, contacting parents by phone, email or SMS ("we are going to hound you"). Whether the reasons given for an absence are justified will be assessed by the administration and may appear on the report card, a key document for Parcoursup.
      ◦ Combating avoidance strategies: Strategic absences on test days will be countered. A system for making up missed tests will run on 24 Saturdays during the year. If a student's grades are not representative of their level, a one-off baccalauréat exam may be imposed.

      • Digital use:
      ◦ Mobile phones: Banned in all buildings unless an adult gives explicit permission. Dependence and the distraction caused by notifications are seen as major obstacles to learning.
      ◦ Artificial intelligence and cheating: The teaching team is aware of the difficulties that tools such as ChatGPT pose for homework (essays). Copying between students is also monitored. The emphasis is on the need for personal work.

      Student Health and Well-being

      Première can be a source of anxiety for students because of the pressure of grades, the baccalauréat and future orientation.

      Parents are encouraged to contact the school (CPE, teachers) if they notice a change in behavior or signs of distress in their child.

      Families are reminded that regular work is more productive and less anxiety-inducing than last-minute revision.

      3. The Baccalauréat: Format and Exams

      Structure and Coefficients

      The final baccalauréat grade is made up of 40% continuous assessment and 60% final exams. Adjustments to the coefficients have been announced.

      Exams in the general track (coefficient, when taken):

      • Continuous assessment (all core-curriculum subjects): 40%, première and terminale
      • Specialty subject 1: 16, terminale
      • Specialty subject 2: 16, terminale
      • Philosophy: 8, terminale
      • Grand Oral: 8 (formerly 10), terminale
      • Early French exam (written + oral): 5, première
      • Early mathematics exam (new): 2, première
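
      As an illustration of how these coefficients combine, here is a minimal Python sketch of a coefficient-weighted average. It is not an official calculation: the dictionary keys, the function name `weighted_average` and the sample marks are hypothetical, and continuous assessment is treated as a single component with weight 40. The coefficients as listed add up to 95 rather than 100 (the French coefficient may apply to the written and oral parts separately, which the notes do not confirm), so the sketch divides by the sum of the weights instead of assuming 100.

```python
# Coefficients as listed in the briefing (general track).
# Key names are hypothetical labels chosen for this sketch.
COEFFICIENTS = {
    "continuous_assessment": 40,
    "specialty_1": 16,
    "specialty_2": 16,
    "philosophy": 8,
    "grand_oral": 8,
    "french_early_exam": 5,
    "maths_early_exam": 2,
}


def weighted_average(marks, coefficients=COEFFICIENTS):
    """Return the coefficient-weighted mean of marks out of 20."""
    missing = set(coefficients) - set(marks)
    if missing:
        raise ValueError(f"missing marks for: {sorted(missing)}")
    # Normalize by the actual sum of weights (95 with the listed values).
    total_weight = sum(coefficients.values())
    weighted_sum = sum(marks[name] * weight for name, weight in coefficients.items())
    return weighted_sum / total_weight


# Hypothetical student: 10/20 everywhere except 20/20 in the Grand Oral.
sample_marks = {name: 10.0 for name in COEFFICIENTS}
sample_marks["grand_oral"] = 20.0
print(round(weighted_average(sample_marks), 2))  # → 10.84
```

      With these assumed weights, a perfect Grand Oral lifts an otherwise 10/20 record by a little under one point, which gives a rough sense of how far a single coefficient-8 exam can move the overall average.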

      The Early Exams at the End of Première

      • French:
      ◦ Written: 4 hours (commentary or essay).
      ◦ Oral: Based on the texts studied during the year.
      ◦ Preparation: Reading the four set works (and the additional assigned readings) is essential. The school organizes a written mock exam and an oral mock exam.

      • Mathematics (new):
      ◦ Format: Exam marked out of 20 points (6 points on core skills, or automatismes, and 14 points on the year's syllabus).
      ◦ Major constraint: Calculators are not allowed. The aim is to restore meaning to calculation and reasoning.
      ◦ Papers: Three separate papers will be offered (technological track, general track without the maths specialty, general track with the maths specialty).
      ◦ Preparation: A mock exam will be held in April or May.

      Results and Stakes at Lycée Louis Vincent

      • Pass rate: 98.44% in the general bac (2025 session).
      • Honors rate: 74% of students earned honors (a mention).

      These results, above what is expected for the académie, show that passing the baccalauréat is within reach of students who work steadily.

      The real stake is therefore to do well enough in the baccalauréat to earn honors.

      4. Parcoursup: Preparing for the Future from Première

      A Long-Term Reflection

      Report cards from première carry great weight in the Parcoursup record.

      Thinking ahead early helps students stay motivated and target the subjects in which they need good results to get into the programs they want.

      Tools such as the Parcoursup site itself or the Suptracker site (for admission statistics) are recommended.

      Effective Support

      Last year, 100% of the lycée's terminale students received a place through Parcoursup, which reflects the quality of the support provided by staff.

      The system is defended as an opportunity for students to apply to a variety of programs without initially ranking their applications, which widens the range of possibilities.

      5. Communication Tools and Practical Questions

      • Digital platforms:
      ◦ Mon Bureau Numérique (MBN): The main tool for emailing teachers (via EduConnect) and for consulting the homework diary (cahier de textes).
      ◦ Pronote: The reference tool for the timetable and for checking grades. It is synchronized with MBN and accessible through it.

      • The "digital pause" (pause numérique): A ministerial directive provides for blocking access to digital environments after 8 p.m. and on weekends. The Région Grand Est has suspended this measure until December 2023, notably because of access to digital textbooks. The future of the measure is uncertain.

      • Accompagnement Personnalisé (AP): In French and mathematics, teachers decide which students must attend according to need. If a student is called in, the AP session appears directly in their timetable on Pronote.

      • Teacher absences: The lycée started the year fully staffed, a favorable situation that limits the risk of absent teachers not being replaced, unlike in other académies.

      • Sports (UNSS): Competitions take place on Wednesday afternoons. Participating students are excused but must catch up on the lessons they miss. The school achieves excellent results and regularly takes part in the French national championships.

      • Timetable complexity: The reform and the specialty system create great complexity, with 12 to 14 different timetables within a single class of 35 students. Students are asked to check Pronote every morning for any changes (rooms, absences).

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript from Rodriguez Gama et al. proposes several interesting conclusions based on different oligomerization properties of Death-Fold Domains (DFDs) in cells, their natural abundance, and supersaturation properties. These ideas are:

      (1) DFDs broadly store the cell's energy by remaining in a supersaturated state;

      (2) Cells are constantly in a vulnerable state that could lead to cell death;

      (3) The cell's lifespan depends on the supersaturation levels of certain DFDs.

      Overall, the evidence supporting these claims is not completely solid. Some concerns were noted.

      Strengths:

      Systematic analysis of DFD self-assembly and its relationship with protein abundance, supersaturation, cell longevity, and evolution.

      Weaknesses

      (1) On page 2, it is stated, "Nucleation barriers increase with the entropic cost of assembly. Assemblies with large barriers, therefore, tend to be more ordered than those without. Ordered assembly often manifests as long filaments in cells," as a way to explain the observation that DFD assemblies that transitioned discontinuously form fibrils, whereas those that transitioned continuously (low-to-high) formed spherical or amorphous puncta. Conventional confocal microscopy is unlikely to differentiate between amorphous and structured puncta. Some DFDs self-assemble into structured puncta formed by intertwined fibrils. Such fibril nets are more structured and thus should be associated with a higher entropic cost. Therefore, the results in Figure 1B do not seem to agree with the reasoning described.

      (2) Errors for the data shown in Figure 1B would have been very useful to determine whether the population differences between diffuse, punctate, and fibrillar for the continuous (low-to-high) transition are meaningful.

      (3) A main concern in the data shown in Figure 1B and F is that the number of counts for discontinuous compared to continuous is small. Thus, the significance of the results is difficult to evaluate in the context of the broad function of DFDs as batteries, as stated at the beginning of the manuscript.

      (4) The proteins or domains that are self-seeded (Figure 1F) should be listed such that the reader has a better understanding of whether domains or full-length proteins are considered, whether other domains have an effect on self-seeding (which is not discussed), and whether there is repetition.

      (5) The authors indicate an anticorrelation between transcript abundance and Csat based on the data shown in Figure 2B; however, the data are scattered. It is not clear why an anticorrelation is inferred.

      (6) It would be useful to indicate the expected range of degree centrality. The differences observed are very small. This is specifically the case for the BC values. The lack of context and the small differences cast doubts on their significance. It would be beneficial to describe these data in the context of the centrality values of other proteins.

      (7) Page 3 section title: "Nucleation barriers are a characteristic feature of inflammatory signalosome adaptors." This title seems to contradict the results shown in Figure 2D, where full-length CARD9 and CARD11 are classified as sensors, but it has been reported that they are adaptor proteins with key roles in the inflammatory response. Please see the following references as examples: The adaptor protein CARD9 is essential for the activation of myeloid cells through ITAM-associated and Toll-like receptors. Nat Immunol 8, 619-629 (2007), and Mechanisms of Regulated and Dysregulated CARD11 Signaling in Adaptive Immunity and Disease. Front Immunol. 2018 Sep 19;9:2105.

      However, both CARD9 and CARD11 show discontinuous to continuous behavior for the individual DFDs versus full-length proteins, respectively, in contrast to the results obtained for ASC, FADD, etc. FADD plays a key role in apoptosis but shows the same behavior as BCL10 and ASC. However, the manuscript indicates that this behavior is characteristic of inflammatory signalosomes. What is the explanation for adaptor proteins behaving in different ways? This casts doubts about the possibility of deriving general conclusions on the significance of these observations, or the subtitles in the results section seem to be oversimplifications.

      (8) IFI16-PYD displays discontinuous behavior according to Figure S1H; however, it is not included in Figure 2D, whereas AIM2 is.

      (9) To demonstrate that "Nucleation barriers facilitate signal amplification in human cells," constructs using APAF1 CARD, NLRC4 CARD, caspase-9 CARD, and a chimera of the latter are used to create what the authors refer to as apoptosomes. Even though puncta are observed, referring to these assemblies as apoptosomes seems somewhat misleading. In addition, it is not clear why the activity of caspase-9 was not measured directly, instead of that of caspase-3 and 7, which could be activated by other means. The polymerization of caspase-1 CARD with NLRC4 CARD, leading to irreversible puncta, could just mean that the polymers are more stable. In fact, not all DFDs form equally stable or identical complexes, which does not necessarily imply that a nucleation barrier facilitates signal amplification. Could this conclusion be an overstatement?

      (10) To demonstrate that "Innate immune adaptors are endogenously supersaturated," it is stated on page 5 that ASC clusters continue to grow for the full duration of the time course and that AIM2-PYD stops growing after 5 min. The data shown in Figure 4F indicate that AIM2-PYD continues to grow after 5 min, although slowly, and that ASC starts to slow down at ~13 min. Because ASC has two DFDs, assemblies can grow faster and become bigger. How is this related to supersaturation?

    1. (1) transitioning among identities, (2) balancing identity continuity and change, (3) personal identity development through time and (4) personal and stable identity”

      *** (The Multifaceted Ideas)

    1. It has many updates

      Update of:

      • Convalescent plasma or hyperimmune immunoglobulin for people with COVID-19: a rapid review. Valk SJ, Piechotta V, Chai KL, Doree C, Monsef I, Wood EM, Lamikanra A, Kimber C, McQuilten Z, So-Osman C, Estcourt LJ, Skoetz N. Cochrane Database Syst Rev. 2020 May 14;5(5):CD013600. doi: 10.1002/14651858.CD013600. Update in: Cochrane Database Syst Rev. 2020 Jul 10;7:CD013600. doi: 10.1002/14651858.CD013600.pub2. PMID: 32406927. Free PMC article.

      Similar articles:

      • Convalescent plasma or hyperimmune immunoglobulin for people with COVID-19: a living systematic review. Chai KL, Valk SJ, Piechotta V, Kimber C, Monsef I, Doree C, Wood EM, Lamikanra AA, Roberts DJ, McQuilten Z, So-Osman C, Estcourt LJ, Skoetz N. Cochrane Database Syst Rev. 2020 Oct 12;10:CD013600. doi: 10.1002/14651858.CD013600.pub3. Update in: Cochrane Database Syst Rev. 2021 May 20;5:CD013600. doi: 10.1002/14651858.CD013600.pub4. PMID: 33044747.

      • Convalescent plasma or hyperimmune immunoglobulin for people with COVID-19: a living systematic review. Piechotta V, Iannizzi C, Chai KL, Valk SJ, Kimber C, Dorando E, Monsef I, Wood EM, Lamikanra AA, Roberts DJ, McQuilten Z, So-Osman C, Estcourt LJ, Skoetz N. Cochrane Database Syst Rev. 2021 May 20;5(5):CD013600. doi: 10.1002/14651858.CD013600.pub4. Update in: Cochrane Database Syst Rev. 2023 Feb 1;2:CD013600. doi: 10.1002/14651858.CD013600.pub5. PMID: 34013969. Free PMC article.

      • Vitamin D supplementation for the treatment of COVID-19: a living systematic review. Stroehlein JK, Wallqvist J, Iannizzi C, Mikolajewska A, Metzendorf MI, Benstoem C, Meybohm P, Becker M, Skoetz N, Stegemann M, Piechotta V. Cochrane Database Syst Rev. 2021 May 24;5(5):CD015043. doi: 10.1002/14651858.CD015043. PMID: 34029377. Free PMC article.

      • Remdesivir for the treatment of COVID-19. Ansems K, Grundeis F, Dahms K, Mikolajewska A, Thieme V, Piechotta V, Metzendorf MI, Stegemann M, Benstoem C, Fichtner F. Cochrane Database Syst Rev. 2021 Aug 5;8(8):CD014962. doi: 10.1002/14651858.CD014962. Update in: Cochrane Database Syst Rev. 2023 Jan 25;1:CD014962. doi: 10.1002/14651858.CD014962.pub2. PMID: 34350582. Free PMC article.

      1. The Cornell system is a lot more organized than the list method of taking notes.
      2. Benefits: for many people, typing is faster and easier than writing by hand. Problems: taking notes by hand is reported to be more effective for memory, and when typing you can be tempted to transcribe everything the professor says because you can keep up with them.
      3. Ask to see another classmate's notes and make them your own (do not copy them as your classmate wrote them), ask the professor what you missed in class that day, and if the professor records their lecture, listen to it.
    1. fibrocartilaginous

      medical term describing something as composed of fibrocartilage, a type of tissue containing both fibrous and cartilaginous components, often found between tendons, ligaments, or bones.

    1. Strategic Roadmap for the Terminale Year: A Model for Success

      1.0 Introduction: The Strategic Stakes of the Terminale Year

      The Terminale year is a turning point in a student's path, marked by two strategic goals.

      On the one hand, it is the culmination of the lycée years, with the tangible objective of obtaining the baccalauréat.

      On the other hand, it is when students actively and decisively prepare for their future, through their choices of higher-education programs on the Parcoursup platform.

      This roadmap is designed as a general guide, summarizing best practices and key information to structure the support given to students and their families throughout this dense and decisive year.

      The first pillar of success is a firm grasp of how the baccalauréat is assessed.

      2.0 The Baccalauréat: Structure, Preparation, and Assessment

      Success in the baccalauréat rests on a clear understanding of how it is assessed and on methodical preparation throughout the year.

      The aim is no longer only to do well in the final exams, but to build success continuously.

      A student's overall performance results from a balance between regular work, validated through continuous assessment, and the ability to perform in the final exams.

      2.1 Analysis of the Assessment Structure

      The final grade of the general baccalauréat is a weighted sum that reflects both the student's record over time and their performance in the final exams.

      The weights are split as follows: 40% for continuous assessment and 60% for the final exams.

      The final exams carry the largest share of the final grade. In the general track, they break down as follows:

      Final exam: coefficient (general track)
      • Philosophy: 8
      • Specialty subject 1: 16
      • Specialty subject 2: 16
      • Grand Oral: 10

      To these are added the grades from the early French exams (written, coefficient 5, and oral, coefficient 5), taken at the end of Première, which bring the final-exam share to the total of 60%.

      Continuous assessment, for its part, is based on the yearly averages in all subjects of the final two years (Première and Terminale), which are officially validated at the end-of-year class councils.
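
      The weighting described above can be checked with a short worked example. This is only a sketch under stated assumptions: the function and variable names are hypothetical, continuous assessment is treated as a single block with coefficient 40 (the source gives only its overall 40% share, not a per-subject breakdown), and grades are on the usual 0 to 20 scale.

```python
# Final-exam coefficients as listed in the text (general track).
TERMINAL_COEFFICIENTS = {
    "philosophy": 8,
    "specialty_1": 16,
    "specialty_2": 16,
    "grand_oral": 10,
    "french_written": 5,  # early exam taken at the end of Première
    "french_oral": 5,     # early exam taken at the end of Première
}

# Assumption: continuous assessment is modeled as one block worth 40 points
# of coefficient, matching the 40% share stated in the text.
CONTINUOUS_COEFFICIENT = 40


def final_grade(terminal_marks, continuous_average):
    """Return the weighted average (0-20) of final exams and continuous assessment."""
    total_coefficient = sum(TERMINAL_COEFFICIENTS.values()) + CONTINUOUS_COEFFICIENT
    weighted_sum = sum(
        coefficient * terminal_marks[name]
        for name, coefficient in TERMINAL_COEFFICIENTS.items()
    )
    weighted_sum += CONTINUOUS_COEFFICIENT * continuous_average
    return weighted_sum / total_coefficient


# Hypothetical student, for illustration only.
example_marks = {
    "philosophy": 12,
    "specialty_1": 14,
    "specialty_2": 13,
    "grand_oral": 16,
    "french_written": 11,
    "french_oral": 13,
}
example_grade = final_grade(example_marks, continuous_average=13.5)
print(round(example_grade, 2))
```

      The final-exam coefficients sum to 60, so together with the 40 points for continuous assessment each coefficient point corresponds to one percent of the final grade. In this sketch, one extra point on the Grand Oral raises the final grade by 0.10, and one extra point on a specialty exam raises it by 0.16.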

      2.2 The Grand Oral: A High-Potential Lever for Success

      One of the most strategic exams of the new baccalauréat is unquestionably the Grand Oral.

      With its high coefficient of 10 in the general track, it is a major lever for the final grade.

      It is an exam with a particularly attractive "grade yield": a well-prepared student is more likely to earn top marks here than in some traditional written subjects.

      The skills assessed in the Grand Oral are fundamental:

      • Learning to speak in public clearly and convincingly.
      • Demonstrating the ability to argue and think critically.
      • Expressing oneself clearly and showing command of one's knowledge.

      The exam lasts 20 minutes in front of a jury. Beforehand, the student prepares two questions related to their specialty subjects.

      On the day, the student has 10 minutes to present an answer to one of the questions, followed by 10 minutes of discussion and follow-up with the jury.

      Beyond its weight in the exam, the Grand Oral matters because it develops speaking skills that are crucial for further study (admission interviews) and for working life as a whole.

      2.3 Ensuring the Integrity of Continuous Assessment: Strategies and Framework

      To keep continuous assessment fair and representative of students' real level, schools put precise regulatory mechanisms in place.

      Each lycée has an "assessment plan" (projet d'évaluation) intended to guarantee equal treatment of all candidates.

      To counter the temptation of strategic absenteeism (skipping a test after a good grade in order to protect one's average), a clear policy is applied.

      The school systematically offers the chance to make up a missed assessment, often in sessions held on Saturday mornings.

      This measure is meant to thwart such tactics and ensure that the average reflects regular work.

      If, despite this, a student's grades are not judged "robust" (that is, not representative of their real level because there are too few assessments), the school may organize a one-off individual exam to validate their skills.

      2.4 A Structured Preparation Calendar

      Preparation for the final exams is paced by practice sessions that the school organizes throughout the year.

      These sessions are crucial for familiarizing students with exam conditions.

      • Common tests (devoirs communs): Assessment sessions are organized, frequently on Saturday mornings, to simulate the real conditions of the written exams (duration, format, environment).
      • Mock exams: Full mock exams are held, including days with regular classes suspended (for example in April) for the specialty exams, allowing full immersion.
      • Mock oral: Dedicated practice for the Grand Oral is organized. It lets students test themselves and receive constructive feedback from the examiners to refine their performance.

      Practicing under real conditions is essential. It teaches students to manage their time over a 4-hour exam, to control their stress in a large exam hall, and to become familiar with the official format.

      This preparation builds confidence and reduces surprises on the day.

      Obtaining the baccalauréat is the first step toward success.

      The second, just as crucial, is to prepare actively for the future and turn an orientation plan into reality through Parcoursup.

      3.0 Parcoursup: Navigating Strategically Toward Higher Education

      Parcoursup is the central, unavoidable tool for post-baccalauréat orientation in France.

      Far from being a simple registration platform, it requires anticipation, thorough research, and a considered strategy to be used effectively.

      A well-run process lets students choose their future rather than have it imposed on them, by aligning their aspirations with the realities and expectations of higher education.

      3.1 The Calendar in Three Key Stages

      The Parcoursup process follows a precise national calendar, organized in three main phases.

      1. Phase 1 (December to January): Discovering Programs. The Parcoursup site opens for consultation.

      From mid-December, information on the thousands of available programs is updated for the following academic year. This is the period for research, exploration, and a first selection.

      2. Phase 2 (Mid-January to Early April): Registration and Submission of Choices. This phase is devoted to creating the application file and entering choices (vœux) and sub-choices (sous-vœux).

      A first deadline is set in mid-March for adding choices, and a second in early April for completing each file with the required elements (statements of motivation for each program, etc.).

      3. Phase 3 (Early June to Early July): Receiving and Managing Offers. This is the results phase.

      Students receive responses from the programs and must handle admission offers by replying within the set deadlines.

      3.2 The File: A Complete Portrait of the Candidate

      The Parcoursup file is much more than a simple transcript.

      The review committees, made up of teaching staff and not of artificial intelligences, analyze the files to identify students who are committed, serious, and motivated.

      Teachers' comments on report cards are of capital importance.

      They give qualitative context to the grades and valuable indications of the student's seriousness, involvement in class, progress, and potential to succeed in higher education.

      It is crucial to note that positive comments are within reach of every student: involvement, attention in class, and visible effort are qualities that teachers value and systematically report.

      Conversely, absences, lateness, and remarks about behavior are disqualifying for many programs.

      As the evaluators explain, when faced with thousands of files, a report card that reads "regularly absent" is often set aside immediately.

      Between two candidates with similar results, the choice will always go to the one who has demonstrated attendance and seriousness.

      3.3 Orientation and Decision-Support Tools

      Several resources are available to students and their families to inform their choices and build a solid plan.

      • The PsyEN (Psychologues de l'Éducation Nationale): Booking an appointment with a PsyEN as early as the first term is strongly recommended.

      Their schedules fill up quickly, and an early consultation makes it possible to start a guided reflection before the Parcoursup deadlines.

      • Fairs and Forums: Events such as the "Réaction" fair, which presents nearly 5,000 programs, or the program forum held within the lycée itself, are key moments.

      They are an opportunity to meet representatives of schools, students, and professionals and to ask concrete questions.

      • SupTracker: This online tool is indispensable for consulting detailed statistics on the profiles of candidates admitted to each program in previous years.

      For example, it shows which combinations of specialties are most represented among those admitted to PASS (parcours d'accès spécifique santé), helping students judge how well their profile fits the programs they are targeting.

      • Parcoursup: The platform itself is a mine of information.

      Each program page details the "attendus" (required skills and knowledge), the criteria used to assess applications, and often the previous year's statistics, such as the average grade of the last admitted candidate.

      3.4 Building a Smart Strategy for Choices

      A successful strategy for choices is one that is planned early and well constructed.

      1. Start thinking early: It is imperative not to wait until the last few weeks to think about orientation. Research should begin at the start of the year to avoid decisions made in a rush.

      2. Diversify and secure choices: It is strategic to enter enough choices to cover several scenarios, from the most ambitious to the safest.

      After the mid-March deadline, nothing can be added. It is therefore better to include "safety" programs or alternatives when in doubt than to risk limiting one's options.

      3. Make coherent choices: Choices must match the student's academic profile (specialties taken, school results, skills). Consulting the statistics on Parcoursup and SupTracker helps adjust the strategy.

      4. Understand waiting lists: A distant position on a waiting list is no reason to be discouraged.

      Waiting lists move very quickly. Checking the rank of the last candidate admitted the previous year gives a valuable, though not guaranteed, indication of the chances of eventually being accepted.

      Success along this complex path does not rest on the student alone, but on the mobilization of the whole ecosystem around them.

      4.0 The Ecosystem of Success: The Role of Staff, Students, and Families

      Success in Terminale is a collective undertaking.

      It depends on close collaboration and shared commitment among the school, which provides the framework and support; the student, who is the main actor in their own path; and the family, which offers indispensable support.

      4.1 Support from the Educational Team

      Within the lycée, a support network is specifically structured to accompany Terminale students.

      Having two homeroom teachers (professeurs principaux) per class is one of the keys to this support. Generally, one is designated the "referent teacher" (professeur référent) and focuses on orientation and Parcoursup, while the other concentrates on class life and overall academic follow-up.

      These teachers are the first point of contact for students and families.

      They are best placed to answer questions, advise, and guide throughout the year. Alongside them, the CPE (Conseillers Principaux d'Éducation) and the school's leadership team provide overall follow-up and step in to assist.

      4.2 The Indispensable Commitment of Families

      Parents play a crucial role in supporting the student kindly and effectively. Here is some practical advice for families:

      • Open a dialogue about orientation: It is often more productive to approach the subject not only with the question "What do you want to do later?" but also with its opposite: "What do you absolutely not want to do?"

      This process of elimination makes it easier to pin down both interests and dislikes.

      • Support without deciding for them: Parents can act as facilitators by becoming familiar with tools such as Parcoursup and SupTracker.

      They can then help their child with research, discuss the options, and help structure their thinking, without imposing their own choices.

      • Keep an eye on attendance: Particular vigilance is recommended regarding absences, especially on test days.

      It is important to talk with one's child to understand the reasons for any loss of motivation and to support the school's policies aimed at ensuring regular work.

      • Anticipate the tension around results: The first week of June, when Parcoursup responses are published, is a period of intense and unavoidable stress.

      Waiting lists can be a source of anxiety. Parents should prepare for this phase, understand that it is a normal step in the process, and be ready to support their child in a context that, as the lycée's staff point out, can make "the family atmosphere... heated".

      5.0 Conclusion: A Year of Preparation and Transition

      Ultimately, the Terminale year should be seen less as an end in itself than as an actively prepared transition toward the future.

      The twin objectives of obtaining the baccalauréat and succeeding in orientation are entirely achievable when the approach is structured.

      A clear understanding of expectations, open communication among the student, the family, and the educational team, together with regular work and serious preparation, are what secure a double success: a diploma obtained and a post-baccalauréat path that is chosen, ambitious, and fulfilling.

    1. Reading list: https://docs.google.com/spreadsheets/d/1lCufgJO4WJJpO6EUpGggeWdz9UnAahGbwDL_IEKfYAU/edit?gid=0#gid=0

      Date      Section
      9/16/25   What is Life? Preface, Chapter 1
      9/23      Chapter 2
      9/30      Chapter 3
      10/7      Chapter 4
      10/14     Chapter 5
      10/21     Chapter 6
      10/28     Chapter 7
      11/4      Epilogue
      11/11     Mind and Matter Chapter 1
      11/18     Chapter 2
      11/25     BREAK
      12/2      Chapter 3 + 4
      12/9      Chapter 5
      12/16     Chapter 6

    1. Author Response :

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This work shows that a specific adenosine deaminase protein in Dictyostelium generates the ammonia that is required for tip formation during Dictyostelium development. Cells with an insertion in the ADGF gene aggregate but do not form tips. A remarkable result, shown in several different ways, is that the ADGF mutant can be rescued by exposing the mutant to ammonia gas. The authors also describe other phenotypes of the ADGF mutant such as increased mound size, altered cAMP signalling, and abnormal cell type differentiation. It appears that the ADGF mutant has defects in the expression of a large number of genes, resulting in not only the tip defect but also the mound size, cAMP signalling, and differentiation phenotypes.

      Strengths:

      The data and statistics are excellent.

      Weaknesses

      (1) The key weakness is understanding why the cells bother to use a diffusible gas like ammonia as a signal to form a tip and continue development.

      Ammonia can come from a variety of sources both within and outside the cells, including dead cells. By increasing cAMP levels, ammonia triggers collective cell movement, thereby establishing a tip in Dictyostelium. A gaseous signal can act over long distances in a short time; for instance, ammonia promotes synchronous development in a colony of yeast cells (Palkova et al., 1997; Palkova and Forstova, 2000). The slug tip is known to release ammonia, probably favouring synchronized development of the entire Dictyostelium colony. However, after the tips are established, ammonia exerts negative chemotaxis, probably helping the slugs to move away from each other and ensuring equal spacing of the fruiting bodies (Feit and Sollitto, 1987).

      It is well known that ammonia serves as a signalling molecule influencing both multicellular organization and differentiation in Dictyostelium (Francis, 1964; Bonner et al., 1989; Bradbury and Gross, 1989). Ammonia by raising the pH of the intracellular acidic vesicles of prestalk cells (Poole and Ohkuma, 1981; Gross et al, 1983), and the cytoplasm, is known to increase the speed of chemotaxing amoebae (Siegert and Weijer, 1989; Van Duijn and Inouye, 1991), inducing collective cell movement (Bonner et al., 1988, 1989), favoring tipped mound development.

      Ammonia produced in millimolar concentrations during tip formation (Schindler and Sussman, 1977) could ward off other predators in soil. For instance, ammonia released by Streptomyces symbionts of leaf-cutting ants is known to inhibit fungal pathogens (Dhodary and Spiteller, 2021). Additionally, ammonia may be recycled back into amino acids, as observed during breast cancer proliferation (Spinelli et al., 2017). Such a process may also occur in starving Dictyostelium cells, supporting survival and differentiation. These findings suggest that ammonia acts as both a local and long-range regulatory signal, integrating environmental and cellular cues to coordinate multicellular development.

      (2) The rescue of the mutant by adding ammonia gas to the entire culture indicates that ammonia conveys no positional information within the mound.

      Ammonia reinforces or maintains the positional information by elevating cAMP levels, favoring prespore differentiation (Bradbury and Gross, 1989; Riley and Barclay, 1990; Hopper et al., 1993). Ammonia is known to influence rapid patterning of Dictyostelium cells confined in a restricted environment (Sawai et al., 2002). In adgf mutants, which have low ammonia levels, both neutral red staining (a marker for prestalk cells and ALCs) (Figure S3) and expression of the prestalk markers ecmA/ecmB (Figure 7D) are higher than in the WT, and the mound arrest phenotype can be reversed by exposing the adgf mutant mounds to ammonia.

      Prestalk cells are enriched in acidic vesicles, and ammonia, by raising the pH of these vesicles and the cytoplasm (Davies et al 1993; Van Duijn and Inouye 1991), plays an active role in collective cell movement during tip formation (Bonner et al., 1989).

      (3) By the time the cells have formed a mound, the cells have been starving for several hours, and desperately need to form a fruiting body to disperse some of themselves as spores, and thus need to form a tip no matter what.

      Exposure of adgf mounds to ammonia led to tip development within 4 h (Figure 5). In contrast, adgf controls remained at the mound stage for at least 30 h. This demonstrates that starvation alone is not the trigger for tip development and that ammonia promotes the transition from mound to tipped mound.

      Many mound arrest mutants are blocked in development and do not proceed to form fruiting bodies (Carrin et al., 1994). Further, not all the mound arrest mutants tested in this study were rescued by ADA enzyme (Figure S4A); those that were not rescued remained as mounds.

      (4) One can envision that the local ammonia concentration is possibly informing the mound that some minimal number of cells are present (assuming that the ammonia concentration is proportional to the number of cells), but probably even a minuscule fruiting body would be preferable to the cells compared to a mound. This latter idea could be easily explored by examining the fate of the ADGF cells in the mound - do they all form spores? Do some form spores?

      Or perhaps the ADGF is secreted by only one cell type, and the resulting ammonia tells the mound that for some reason that cell type is not present in the mound, allowing some of the cells to transdifferentiate into the needed cell type. Thus, elucidating if all or some cells produce ADGF would greatly strengthen this puzzling story.

      A fraction of adgf mounds form bulkier spore heads by the end of 36 h as shown in Figure. 2H. This late recovery may be due to the expression of other ADA isoforms. Mixing WT and adgf mutant cell lines results in a chimeric slug with mutants occupying the prestalk region (Figure. 8) and suggests that WT ADGF favours prespore differentiation. However, it is not clear if ADGF is secreted by a particular cell type, as adenosine can be produced by both cell types, and the activity of three other intracellular ADAs may vary between the cell types. To address whether adgf expression is cell type-specific, prestalk and prespore cells will be separated by fluorescence activated cell sorter (FACS), and thereafter, adgf expression will be examined in each population.

      Reviewer #1 (Recommendations for the authors):

      (1) Lines: 47,48 - "The gradient of these morphogens along the slug axis determines the cell fate, either as prestalk (pst) or as prespore (psp) cells." - many workers have shown that this is not true - intrinsic factors such as cell cycle phase drive cell fate.

      Thank you for pointing this out. We have removed the line and rephrased as “Based on cell cycle phases, there exists a dichotomy of cell types, that biases cell fate as prestalk or prespore (Weeks and Weijer, 1994; Jang and Gomer, 2011).”

      (2) Line 48 - PKA - please explain acronyms at first use.

      Corrected

      (3) Line 56 - The relationship between adenosine deaminase and ADGF is a bit unclear, please clarify this more.

      Adenosine deaminase (ADA) is intracellular, whereas adenosine deaminase related growth factor (ADGF) is an extracellular ADA and has a growth factor activity (Li and Aksoy, 2000; Iijima et al., 2008).

      (4) Figure 1 - where are these primers, and the bsr cassette, located with respect to the coding region start and stop sites?

      The primer sequences are mentioned in the supplementary table S2. The figure legend is updated to provide a detailed description.

      (5) Line 104 - 37.47% may be too many significant figures.

      Corrected

      (6) Line 123 - 1.003 Å may be too many significant figures.

      Corrected

      (7) Line 128 - Since the data are in the figure, you don't need to give the numbers, also too many significant figures.

      Corrected

      (8) Figure 3G - did the DCF also increase mound size? It sort of looks like it did.

      Yes, the addition of DCF increases the mound size (now Figure. 2G).

      (9) Figure 3I - the spore mass shown here for ADGF - looks like there are 3 stalks protruding from it; this can happen if a plate is handled roughly and the spore masses bang into each other and then merge

      Thank you for pointing this out. The figure 3I (now Figure. 2I) is replaced.

      (10) Lines 160-162 - since the data are in the figure, you don't need to give the numbers, also too many significant figures.

      Corrected.

      (11) Line 165 - ' ... that are involved in adenosine formation' needs a reference.

      Reference is included.

      (12) Line 205 - 'Addition of ADA to the CM of the mutant in one compartment.' - might clarify that the mutant is the ADGF mutant

      Yes, revised to 'Addition of ADA to the CM of the adgf mutant in one compartment.'

      (13) Lines 222-223 need a reference for caffeine acting as an adenosine antagonist.

      Reference is included.

      (14) Figure 8B - left - use a 0-4 or so scale so the bars are more visible.

      Thank you for the suggestion. The scale of the y-axis is adjusted to 0-4 in Figure. 7B to enhance the visibility of the bars.

      Reviewer #2 (Public review):

      Summary:

      The paper describes new insights into the role of adenosine deaminase-related growth factor (ADGF), an enzyme that catalyses the breakdown of adenosine into ammonia and inosine, in tip formation during Dictyostelium development. The ADGF null mutant has a pre-tip mound arrest phenotype, which can be rescued by the external addition of ammonia. Analysis suggests that the phenotype involves changes in cAMP signalling possibly involving a histidine kinase dhkD, but details remain to be resolved.

      Strengths:

      The generation of an ADGF mutant showed a strong mound arrest phenotype and successful rescue by external ammonia. Characterization of significant changes in cAMP signalling components, suggesting low cAMP signalling in the mutant and identification of the histidine kinase dhkD as a possible component of the transduction pathway. Identification of a change in cell type differentiation towards prestalk fate.

      (1) Weaknesses: Lack of details on the developmental time course of ADGF activity and cell type-specific differences in ADGF expression.

      adgf expression was examined at 0, 8, 12, and 16 h (Figure. 1), and the total ADA activity was assayed at 12 and 16 h (Figure. 3). Previously, the 12 h data was not included, and it has been added now (Figure. 3A). The adgf expression was found to be highest at 16 h and hence, the ADA assay was carried out at that time point. Since the ADA assay will also report the activity of the other three isoforms, it will not exclusively reflect ADGF activity.

      Mixing WT and adgf mutant cell lines results in a chimeric slug with mutants occupying the prestalk region (Figure. 8) suggesting that WT adgf favours prespore differentiation. To address whether adgf expression is cell type-specific, prestalk and prespore cells will be separated by fluorescence activated cell sorter (FACS), and thereafter, adgf expression will be examined in each population.

      (2) The absence of measurements to show that ammonia addition to the null mutant can rescue the proposed defects in cAMP signalling.

      The adgf mutant in comparison to WT has diminished acaA expression (Fig. 6B) and reduced cAMP levels (Fig. 6A) both at 12 and 16 h of development. The cAMP levels were measured at 8 h and 12 h in the mutant.

      We would like to add that ammonia is known to increase cAMP levels (Riley and Barclay, 1990; Feit et al., 2001) in Dictyostelium. Exposure to ammonia increases acaA expression in WT (Figure. 7B) and is likely to increase acaA expression/ cAMP levels in the mutant also (Riley and Barclay, 1990; Feit et al., 2001) thereby rescuing the defects in cAMP signalling. Based on the comments, cAMP levels will also be measured in the mutant after the rescue with ammonia.

      (3) No direct measurements in the dhkD mutant to show that it acts upstream of adgf in the control of changes in cAMP signalling and tip formation.

      cAMP levels will be quantified in the dhkD mutant after treatment with ammonia. The histidine kinases dhkD and dhkC are reported to modulate phosphodiesterase RegA activity, thereby maintaining cAMP levels (Singleton et al., 1998; Singleton and Xiong, 2013). By activating RegA, dhkD ensures proper cAMP distribution within the mound, which is essential for the patterning of prestalk and prespore cells, as well as for tip formation (Singleton and Xiong, 2013). Therefore, ammonia exposure to dhkD mutants is likely to regulate cAMP signalling and thereby tip formation.

      Reviewer #2 (Recommendations for the authors):

      The paper describes new insights into the role of ADGF, an enzyme that catalyses the breakdown of adenosine into ammonia and inosine, in tip formation in Dictyostelium development.

      A knockout of the gene results in a tipless mound stage arrest and the mounds formed are somewhat larger in size. Synergy experiments show that the effect of the mutation is non-cell autonomous and further experiments show that the mound arrest phenotype can be rescued by the provision of ammonia vapour. These observations are well documented. Furthermore, the paper contains a wide variety of experiments attempting to place the observed effects in known signalling pathways. It is suggested that ADGF may function downstream of DhkD, a histidine kinase previously implicated in ammonia signalling. Ammonia has long been described to affect different aspects, including differentiation of slug and culmination stages of Dictyostelium development, possibly through modulating cAMP signalling, but the exact mechanisms of action have not yet been resolved. The experiments reported here to resolve the mechanistic basis of the mutant phenotype need focusing and further work.

      (1) The paper needs streamlining and editing to concentrate on the main findings and implications.

      The manuscript will be revised extensively.

      Below is a list of some more specific comments and suggestions.

      (2) Introduction: Focus on what is relevant to understanding tip formation and the role of nucleotide metabolism and ammonia (see https://doi.org/10.1016/j.gde.2016.05.014). This could lead to the rationale for investigating ADGF.

      The manuscript will be revised extensively.

      (3) Lines 36-38 are not relevant. Lines 55-63 need shortening and to focus on ADGF, cellular localization, and substrate specificity.

      The manuscript will be revised accordingly. Lines 36-38 will be removed, and the lines 55-63 will be shortened.

      In humans, two isoforms of ADA are known including ADA1 and ADA2, and the Dictyostelium homolog of ADA2 is adenosine deaminase-related growth factor (ADGF). Unlike ADA that is intracellular, ADGF is extracellular and also has a growth factor activity (Li and Aksoy, 2000; Iijima et al., 2008). Loss-of-function mutations in ada2 are linked to lymphopenia, severe combined immunodeficiency (SCID) (Gaspar, 2010), and vascular inflammation due to accumulation of toxic metabolites like dATP (Notarangelo, 2016; Zhou et al., 2014).

      (4) Results: This section would benefit from better streamlining by a separation of results that provide more mechanistic insight from more peripheral observations.

      The manuscript will be revised and the peripheral observations (Figure. 2) will be shifted to the supplementary information.

      (5) Line 84 needs to start with a description of the goal, to produce a knockout.

      Details on the knockout will be elaborated in the revised manuscript. Line number 84 (now 75). Dictyostelium cell lines carrying mutations in the gene adgf were obtained from the genome wide Dictyostelium insertion (GWDI) bank and were subjected to further analysis to know the role of adgf during Dictyostelium development.

      (6) Knockout data (Figure 1) can be simplified and combined with a description of the expression profile and phenotype Figure 3 F, G, and Figure 5. Higher magnification and better resolution photographs of the mutants would be desirable.

      Thank you. As suggested, the data will be simplified (section E will be removed) and combined with a description of the expression profile, and the phenotype images of Figure 3 F, G, and Figure 5 (now Figure. 2 F, G, and Figure. 4) will be replaced with better images/resolution.

      (7) It would also be relevant to know which cells actually express ADGF during development, using in-situ hybridisation or promoter-reporter constructs.

      To address whether adgf expression is cell type-specific, prestalk and prespore cells will be separated by fluorescence activated cell sorter (FACS), and thereafter, adgf expression will be examined in each population.

      (8) Figure 2 - Information is less directly relevant to the topic of the paper and can be omitted (or possibly in Supplementary Materials).

      Figure. 2 will be moved to supplementary materials.

      (9) Figures 4A, B - It is shown that as could be expected ada activity is somewhat reduced and adenosine levels are slightly elevated. However, the fact that ada levels are low at 16 hrs could just imply that differentiation of the ADGF- cells is blocked/delayed at an earlier time point. To interpret these data, it would be necessary to see an ada activity and adenosine time course comparison of wt and mutant, or to see that expression is regulated in a cell type-specific manner that could explain this (see above). It would be good to combine this with the observation that ammonia levels are lower in the ADGF- mutant than wildtype and that the mutant phenotype, mound arrest can be rescued by an external supply of ammonia (Figure 6).

      In Dictyostelium four isoforms of ADA including ADGF are present, and thus the time course of total ADA activity will also report the function of other isoforms. Further, a number of pathways generate adenosine (Dunwiddie et al., 1997; Boison and Yegutkin, 2019). ADGF expression was examined at 0, 8, 12 and 16 h (Fig 1) and the ADA activity was assayed at 12 h, the time point where the expression gradually increases and reaches a peak at 16 h. Earlier, we did not show the 12 h activity data, which will be included in the revised version. ADGF expression was found to be highly elevated at 16 h and adenosine/ammonia levels were measured at the two points indicated in the mutant.

      (10) Panel 4C could be combined with other measurements trying to arrive at more insight in the mechanisms by which ammonia controls tip formation.

      Panel 4C (now 3C) illustrates the genes involved in the conversion of cAMP to adenosine. Since Figure. 3 focuses on adenosine levels and ADA activity in both WT and adgf mutants, we have retained Panel 3C in Figure. 3, for its relevance to the experiment.

      (11) There is a large variety of experiments attempting to link the mutant phenotype and its rescue by ammonia to cAMP signalling, however, the data do not yet provide a clear answer.

      It is well known that ammonia increases cAMP levels (Riley and Barclay, 1990; Feit et al., 2001) and adenylate cyclase activity (Cotter et al., 1999) in D. discoideum, and exposure to ammonia increases acaA expression (Fig 7B) suggesting that ammonia regulates cAMP signaling. To address the concerns, cAMP levels will be quantified in the mutant after ammonia treatment.

      (12) The mutant is shown to have lower cAMP levels at the mound stage which ties in with low levels of acaA expression (Figures 7A and B), also various phosphodiesterases, the extracellular phosphodiesterase pdsa and the intracellular phosphodiesterase regA show increased expression. Suggesting a functional role for cAMP signalling is that the addition of di cGMP, a known activator of acaA, can also rescue the mound phenotype (Figure 7E). There appears to be a partial rescue of the mound arrest phenotype level by the addition of 8Br-cAMP (fig 7C), suggesting that intracellular cAMP levels rather than extracellular cAMP signalling can rescue some of the defects in the ADGF- mutant. Better images and a time course would be helpful.

      The relevant images will be replaced and a developmental time course after 8-Br-cAMP treatment will be included in the revised manuscript (Figure. 6D).

      (13) There is also the somewhat surprising observation that low levels of caffeine, an inhibitor of acaA activation also rescues the phenotype (Figure 7F).

      With respect to caffeine action on cAMP levels, the reports are contradictory. Caffeine has been reported to increase adenylate cyclase expression thereby increasing cAMP levels (Hagmann, 1986) whereas Alvarez-Curto et al., (2007) found that caffeine reduced intracellular cAMP levels in Dictyostelium. Caffeine, although a known inhibitor of ACA, is also known to inhibit PDEs (Nehlig et al., 1992; Rosenfeld et al., 2014). Therefore, if caffeine differentially affects ADA and PDE activity, it may potentially counterbalance the effects and rescue the phenotype.

      (14) The data attempting to assess cAMP wave propagation in mounds (Fig 7H) are of low quality and inconclusive in the absence of further analysis. It remains unresolved how this links to the rescue of the ADGF- phenotype by ammonia. There are no experiments that measure any of the effects in the mutant stimulated with ammonia or di-cGMP.

      The relevant images will be replaced (now Figure. 6H). Ammonia, by increasing acaA expression (Figure. 7B) and cAMP levels (Figure. 7C), may restore spiral wave propagation, thereby rescuing the mutant.

      (15) A possible way forward could also come from the observation that ammonia can rescue the wobbling mound arrest phenotype from the histidine kinase mutant dhkD null mutant, which has regA as its direct target, linking ammonia and cAMP signalling. This is in line with other work that had suggested that another histidine kinase, dhkC transduces an ammonia signal sensor to regA activation. A dhkC null mutant was reported to have a rapid development phenotype and skip slug migration (Dev. Biol. (1998) 203, 345). There is no direct evidence to show that dhkD acts upstream of ADGF and changes in cAMP signalling, for instance, measurements of changes in ADA activity in the mutant.

      cAMP levels will be quantified in the dhkD mutant after ammonia treatment and accordingly, the results will be revised.

      (16) The paper makes several further observations on the mutant. After 16 hrs of development the adgf- mutant shows increased expression of the prestalk cell markers ecmA and ecmB and reduced expression of the prespore marker pspA. In synergy experiments with a majority of wildtype, these cells will sort to the tip of the forming slug, showing that the differentiation defect is cell autonomous (Fig 9). This is interesting but needs further work to obtain more mechanistic insight into why a mutant with a strong tip/stalk differentiation tendency fails to make a tip. Here again, knowing which cells express ADGF would be helpful.

      The adgf mutant shows increased prestalk marker expression in the mound but does not form a tip. It is well known that several mound arrest mutants form differentiated cells but are blocked in development with no tips (Carrin et al., 1994). This is addressed in the discussions (539). To address whether adgf expression is cell type-specific, prestalk and prespore cells will be separated by fluorescence activated cell sorter (FACS), and thereafter, adgf expression will be examined in each population.

      (17) The observed large mound phenotype could as suggested possibly be explained by the low ctn, smlA, and high cadA and csA expression observed in the mutant (Figure 3). The expression of some of these genes (csA) is known to require extracellular cAMP signalling. The reported low level of acaA expression and high level of pdsA expression could suggest low levels of cAMP signalling, but there are no actual measurements of the dynamics of cAMP signalling in this mutant to confirm this.

      The acaA expression was examined at 8 and 12 h (Figure. 6B) and cAMP levels were measured at 12 and 16 h in the adgf mutants (Figure. 6A). Both acaA expression and cAMP levels were reduced, suggesting that cells expressing adgf regulate acaA expression and cAMP levels. This regulation, in turn, is likely to influence cAMP signaling and collective cell movement within mounds, ultimately driving tip development. Exposure to ammonia led to increased acaA expression (Figure. 7B) in WT. Based on the comments above, cAMP levels will be measured in the mutant before and after rescue with ammonia.

      (18) Furthermore, it would be useful to quantify whether ammonia addition to the mutant reverses mound size and restores any of the gene expression defects observed.

      Ammonia treatment soon after plating or six hours after plating had no effect on the mound size (Figure. 5G).

      (19) There are many experimental data in the supplementary data that appear less relevant and could be omitted Figure S1, S3, S4, S7, S8, S9, S10.

      Figure S8, S9, S10 are omitted. We would like to retain the other figures.

      Figure S1 (now Figure. S2): It is widely believed that ammonia comes from protein (White and Sussman, 1961; Hames and Ashworth, 1974; Schindler and Sussman, 1977) and RNA (Walsh and Wright, 1978) catabolism. Figure. S2 shows no significant difference in protein and RNA levels between WT and adgf mutant strains, suggesting that adenosine deaminase-related growth factor (ADGF) activity serves as a major source of ammonia and plays a crucial role in tip organizer development in Dictyostelium. Thus, it is important to retain this figure.

      Figure S3 (now Figure. S4): The figure shows the treatment of various mound arrest mutants and multiple tip mutants with ADA enzyme and DCF, respectively, to investigate the pathway through which adgf functions. Additionally, it includes the rescue of the histidine kinase mutant dhkD with ammonia, indicating that dhkD acts upstream of adgf via ammonia signalling. Therefore, it is important to retain this figure.

      Figure S4 (now Figure. S5): This figure represents the developmental phenotype of other deaminase mutants. Unlike adgf mutants, mutations in other deaminases do not result in complete mound arrest, despite some of these genes exhibiting strong expression during development. This underscores the critical role of adenosine deamination in tip formation. Therefore, let this figure be retained.

      Figure S7 (now Figure. S8): Figure S8 presents the transcriptomic profile of ADGF during gastrulation and pre-gastrulation stages across different organisms, indicating that ADA/ADGF is consistently expressed during gastrulation in several vertebrates (Pijuan-Sala et al., 2019; Tyser et al., 2021). Notably, the process of gastrulation in higher organisms shares remarkable similarities with collective cell movement within the Dictyostelium mound (Weijer, 2009), suggesting a previously overlooked role of ammonia in organizer development. This implies that ADA may play a fundamental role in regulating morphogenesis across species, including Dictyostelium and vertebrates. Therefore, we would like to retain this figure.

      (20) Given the current state of knowledge, speculation about the possible role of ADGF in organiser function in amniotes seems far-fetched. It is worth noting that the streak is not equivalent to the organiser. The discussion would benefit from limiting itself to the key results and implications.

      The discussion is revised accordingly by removing the speculative role of ADGF in organizer function in amniotes. The lines “It is likely that ADA plays a conserved, fundamental role in regulating morphogenesis in Dictyostelium and other organisms including vertebrates” have been removed.

    1. Reviewer #1 (Public review):

      Summary:

      The paper sets out to examine the social recognition abilities of a 'solitary' jumping spider species. It demonstrates that based on vision alone spiders can habituate and dishabituate to the presence of conspecifics. The data support the interpretation that these spiders can distinguish between conspecifics on the basis of their appearance.

      Strengths:

      The study presents two experiments. The second set of data recapitulates the findings of the first experiment with an independent set of spiders, highlighting the strength of the results. The study also uses a highly quantitative approach to measuring relative interest between pairs of spiders based on their distance.

      Weaknesses:

      The study design is overly complicated, while missing key controls, and the data presented in the figures are not clearly connected to the study. The discussion is challenging to understand and appears to make unsupported conclusions.

      (1) Study design: The study design is rather complicated and as a result it is difficult to interpret the results. The spiders are presented with the same individual twice in a row, called a habituation trial. Then a new individual is presented twice in a row. The first of these is a dishabituation trial and the second another habituation trial (but now habituating to a second individual). This is done with three pairings and then this entire structure is repeated over three sessions. The data appear to show the strong effects of differences between habituation and dishabituation trials in the first session. The decrease in differential behavior between the so-called habituation and dishabituation trials in sessions 2 and 3 is explained as a consequence of the spiders beginning to habituate in general to all of the individuals. The claim that the spiders remember specific individuals is somewhat undercut because all of the 'dishabituation' trials in session 2 are toward spiders they already met for 14 minutes previously but seemingly do not remember in session 2. In session 3 it is ambiguous what is happening because the spiders no longer differentiate between the trial types. This could be due to fatigue or familiarity. A second experiment is done to show that introducing a totally novel individual recovers a large dishabituation response, suggesting that the lack of differences between 'habituation' and 'dishabituation' trials in session 3 is the result of general habituation to all of the spiders in the session rather than fatigue. As mentioned before, these data do support the claim that the spiders differentiate among individuals.

      The data from session 1 are easy to interpret. The data from sessions 2 and 3 are harder to understand, but these are the trials in which they meet an individual again after a substantial period of separation. Other studies looking at recognition in ants and wasps (cited by the authors) have done a 4 trial design in which focal animal A meets B in the first trial, then meets C in the second trial, meets B again in the third trial, and then meets D in the last trial. In that scenario trials 1, 2 and 4 are between unfamiliar individuals and trial 3 is between potentially familiar individuals. In both the ants and wasps, high aggression is seen in species with and without recognition on trial 1, with low aggression specifically for trials with familiar individuals in species with recognition. Across different tests, species or populations that lack recognition have shown a general reduction in aggression towards all individuals that becomes progressively less aggressive over time (reminiscent of the session 2 and 3 data) while others have maintained modest levels of aggression across all individuals. The 4 trial design used in those other studies provides an unambiguous interpretation of the data, while controlling for 'fatigue'. That all trials in sessions 2 and 3 are always with familiar individuals makes it challenging to understand how much the spiders are habituating to each other versus having some kind of associative learning of individual identity and behavior.
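      As a rough illustration of the 4 trial design described above (A meets B, C, B, then D), the following sketch lays out the trial order and marks which encounter involves a previously met partner. The function name and labels are hypothetical and are not taken from the cited ant and wasp studies; it only shows why trial 3 is the single familiar encounter.

```python
def four_trial_schedule(focal, partners):
    """Build the B, C, B, D trial order for one focal animal.

    `partners` is a tuple of three partner labels (B, C, D).
    Returns a list of (trial_number, focal, partner, familiar) tuples,
    where `familiar` is True only if the focal animal has already met
    that partner in an earlier trial.
    """
    b, c, d = partners
    order = [b, c, b, d]          # B is the only partner encountered twice
    met = set()                   # partners the focal animal has already seen
    schedule = []
    for trial_number, partner in enumerate(order, start=1):
        schedule.append((trial_number, focal, partner, partner in met))
        met.add(partner)
    return schedule


schedule = four_trial_schedule("A", ("B", "C", "D"))
for trial in schedule:
    print(trial)
```

      Under this layout, fatigue would predict a steady decline across all four trials, whereas recognition would predict a drop only on the familiar trial, followed by recovery on the final unfamiliar one.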

      The data presentation is also very complicated. How is it the case that a negative proportion of time is spent? The methods reveal that this metric is derived by comparing the time individuals spent in each region relative to the previous time they saw that individual. At the very least, data showing the distribution of distances from the wall would be much easier to interpret for the reader.

      (2) "Long-term social memory": It is not entirely clear what is meant by the authors when they say 'long-term social memory', though typically long-term memory refers to a form of memory that requires protein synthesis. While the precise timing of memory formation varies across species and contexts, a general rule is that long term memory should last for > 24 hours (e.g., Dreier et al 2007 Biol Letters). The longest time that spiders are apart in this trial set up is something like an hour. There is no basis to claim that spiders have long term social memory as they are never asked to remember anyone after a long time apart. The odd phrasing of the 'long-term dishabituation' trial makes it seem that it is testing a long-term memory, but it is not. The spiders have never met. The fact that they are very habituated to one set of stimuli and then respond to a new stimulus is not evidence of long-term memory. To clearly test memory (which is the part really lacking from the design), the authors would need to show that spiders, upon the first instance of re-encountering a previously encountered individual, are already 'habituated' to them but not to some other individuals. The current data suggest this may be the case, but it is just very hard to interpret given the design does not directly test memory of individuals in a clear and unambiguous manner.

      (3) Lack of a functional explanation and the emphasis on 'asociality': It is entirely plausible that recognition is a pleiotropic byproduct of the overall visual cognition abilities in the spiders. However, the discussion that discounts territoriality as a potential explanation is not well laid out. First, many species that are 'asocial' nevertheless defend territories. It is perhaps best to say such species are not group living, but they have social lives because they encounter conspecifics and need to interact with them. Indeed, there are many examples of solitary living species that show the dear enemy effect, a form of individual recognition, towards familiar territorial neighbors. The authors in this case note that territorial competition is mediated by the size or color of the chelicerae (seemingly a trait that could be used to distinguish among individuals). Apparently, previous work suggesting that territorial disputes can be mediated by a trait in the absence of familiarity has led them to discount the possibility that keeping track of the local neighbors in a potentially cannibalistic species could be a sufficient functional reason. In any event, the current evidence presented certainly does not warrant discounting that hypothesis.

      Comments on Revision:

      The authors have not actually addressed my points and their comments conflate discrimination with recognition. The extensive discussion about how babies are tested for discrimination tasks in their rebuttal misses the point. I believe that the data do show that the spiders discriminate between individuals but whether individuals are recognized (i.e., remembered) is less clear. The authors defend their convoluted study design, but it is overly complex and challenging to interpret the data as a result.

      The main issue with the design is that they do not actually test for any kind of memory of specific individuals after a substantial time of separation. Instead they show that a new individual is still surprising/dishabituating. That is nice evidence for discrimination but does not show memory in a clear and unambiguous way.

      My comments and critique are unchanged since they didn't really change the paper. New experiments were needed and they didn't do any. Perhaps it is hard to get the spiders where they are? I don't really understand why they didn't do additional experiments as part of this revision.

    2. Reviewer #3 (Public review):

      Summary:

      Jumping spiders (family Salticidae) have extraordinarily good eyesight, but little is known about how sensitive these small animals might be to the identity of other individuals that they see. Here, experiments were carried out using Phidippus regius, a salticid spider from North America. There were three steps in the experiments; first, a spider could see another spider; then its view of the other spider was blocked; and then either the same or a different individual spider came into view. Whether it was the same or a different individual that came into view in the third step had a significant effect on how close together or far apart the spiders positioned themselves. It has been demonstrated before that salticids can discriminate between familiar and unfamiliar individuals while relying on chemical cues, but this new research on P. regius provides the first experimental evidence that a spider can discriminate by sight between familiar and unfamiliar individuals.

      Clark RJ, Jackson RR (1995) Araneophagic jumping spiders discriminate between the draglines of familiar and unfamiliar conspecifics. Ethology, Ecology and Evolution 7:185-190

      Strengths:

      This work is a useful step toward a fuller understanding of the perceptual and cognitive capacities of spiders and other animals with small nervous systems. By providing experimental evidence for a conclusion that a spider can, by sight, discriminate between familiar and unfamiliar individuals, this research will be an important milestone. We can anticipate a substantial influence on future research.

      Weaknesses:

      (1) The conclusions should be stated more carefully.

      (2) It is not clearly the case that the experimental methods are based on 'habituation' (learning to ignore; learning not to respond). Saying 'habituation' seems to imply that certain distances are instances of responding and other distances are instances of not responding but, as a reasonable alternative, we might call distance in all instances a response. However, whether all distances are responses or not is a distracting issue because being based on habituation is not a necessity.

      (3) Besides data related to distances, other data might have been useful. For example, salticids are especially well known for the way they communicate using distinctive visual displays and, unlike distance, displaying is a discrete, unambiguous response.

      (4) Methods more aligned with salticids having extraordinarily good eyesight would have been useful. For example, with salticids, standardising and manipulating stimuli in experiments can be achieved by using mounts, video playback and computer-generated animation.

      (5) An asocial-versus-social distinction is too imprecise, and it may have been emphasised too much. With P. regius, irrespective of whether we use the label asocial or social, the important question pertains to the frequency of encounters between the same individuals and the consequences of these encounters.

      (6) Hypotheses related to not-so-strictly adaptive factors are discussed and these hypotheses are interesting, but these considerations are not necessarily incompatible with more strictly adaptive influences being relevant as well.

      Comments on Revision:

      The authors have responded reasonably to the comments I made. There is nothing else that I wish to add.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The paper sets out to examine the social recognition abilities of a 'solitary' jumping spider species. It demonstrates that based on vision alone spiders can habituate and dishabituate to the presence of conspecifics. The data support the interpretation that these spiders can distinguish between conspecifics on the basis of their appearance.

      We appreciate the reviewer’s summary. We indeed aimed at investigating the social recognition abilities of the solitary jumping spider (Phidippus regius), using visual cues alone. By employing a habituation-dishabituation paradigm, well-established in developmental psychology, we found support for the interpretation that these spiders can distinguish between conspecifics based on their appearance, as the reviewer noted.

      Strengths:

      The study presents two experiments. The second set of data recapitulates the findings of the first experiment with an independent set of spiders, highlighting the strength of the results. The study also uses a highly quantitative approach to measuring relative interest between pairs of spiders based on their distance.

      We appreciate the reviewer's acknowledgement of the strengths of our study. The second set of data underscores the robustness and reliability of the results. Additionally, however, the second experiment served the purpose of disentangling whether the habituation effect observed over sessions was caused by ‘physical’ or ‘cognitive’ fatigue by employing ‘long-term’ dishabituation trials at the end of Session 3. These trials are critical in our study as they help to differentiate between recognition of individual identities versus recognition of familiar individuals (as opposed to unfamiliar ones) and to determine if the observed effects are due to ‘general habituation’ or ‘specific recognition’. We will elaborate on this further below in this revision.

      As stated by the reviewer, we employed a highly quantitative approach to measure relative interest between pairs of spiders based on their distance, providing precise and objective data to support our conclusions.

      Weaknesses:

      The study design is overly complicated, missing key controls, and the data presented in the figures are not clearly connected to the study. The discussion is challenging to understand and appears to make unsupported conclusions.

      While we acknowledge that the study design is indeed complex, this complexity is essential for conducting a well-controlled and balanced experiment regarding the experimental conditions.  

      The habituation-dishabituation paradigm is a well-established paradigm in developmental psychology with non-verbal infants. It is understood that during the habituation phase, an individual's attention to a repeated stimulus decreases as they engage in information processing and form a mental representation of it. As the stimulus becomes familiar, it loses its novelty and interest. When a new stimulus is introduced, a recovery of attention suggests that the individual has compared this new stimulus to the stored memory of the habituation stimulus and detected a difference. This process suggests that the individual not only remembered the original stimulus but also recognized the new one as distinct (for a review Kavšek & Bornstein, 2010).

      This paradigm has also been extensively applied in animal research, where, like infants, nonverbal subjects rely on recognition and discrimination processes to demonstrate their cognitive abilities. The use of this paradigm dates back to seminal studies such as Humphrey (1974), which explored the perceptual world of monkeys, illustrating how species and individuals are perceived and recognized. In another previous study (Dahl, Logothetis, and Hoffman, 2007), we utilized an even more complex experimental design that incorporated dedicated baseline trials for both habituation and dishabituation phases, which was well-received despite its complexity. In the current study, we contrast dishabituation and habituation trials directly, creating a sequential cascade where each trial is evaluated against the preceding one as its baseline.

      On the basis of these arguments, we respectfully reject the claim that this paradigm is inappropriate or lacks key controls. Our study design, though complex, is rigorously grounded in established methodologies and offers a robust framework for exploring individual recognition in Phidippus regius.

      However, we take the reviewer’s comments seriously and are committed to identifying and addressing the aspects in our manuscript that may have led to misunderstandings. We clarify these areas in our revision of the manuscript. Modifications were made in the Introduction, Methods, and Discussion sections.

      Dahl, C. D., Logothetis, N. K., & Hoffman, K. L. (2007). Individuation and holistic processing of faces in rhesus monkeys. Proceedings of the Royal Society B: Biological Sciences, 274(1622), 2069-2076.

      Humphrey, N. K. (1974). Species and individuals in the perceptual world of monkeys. Perception, 3(1), 105-114.

      Kavšek, M., & Bornstein, M. H. (2010). Visual habituation and dishabituation in preterm infants: A review and meta-analysis. Research in developmental disabilities, 31(5), 951-975.

      (1) Study design: The study design is rather complicated and as a result, it is difficult to interpret the results. The spiders are presented with the same individual twice in a row, called a habituation trial. Then a new individual is presented twice in a row. The first of these is a dishabituation trial and the second is another habituation trial (but now habituating to a second individual). This is done with three pairings and then this entire structure is repeated over three sessions. 

      While we acknowledge that the design is complex, this complexity is essential for conducting a well-controlled experiment, as described earlier. As the reviewer noted, our design involves presenting the same individual to the focal spider twice in a row (habituation trial), followed by a new individual (dishabituation trial), and then repeating this structure. This approach is fundamental to the habituation-dishabituation paradigm, which allows us to systematically compare the responses to a familiar individual with those elicited by a novel one. If the spiders exhibit different behaviours in terms of the distance they maintain when encountering the same individual versus a new one, it indicates that they are processing the stimuli differently, consistent with recognition memory. This differential response is a key indicator that the spiders can distinguish between familiar and unfamiliar individuals, demonstrating not only a decrease in interest or engagement due to repeated exposure but also a cognitive process where the lack of a matching memory template triggers a distinct behavioural response when confronted with novel stimuli.

      By repeating this sequence two more times (Session 2 and 3), we aim to assess the consistency of this recognition process over time. If the focal spider does not remember the individuals from the previous session (one hour ago), we expect consistent behavioural responses across sessions. Conversely, if there is a decrease in response magnitude but the overall response patterns are maintained, we can infer that the focal spider recognizes the previously presented individuals and exhibits habituation, reflected in reduced response intensity. In other words, over sessions and repeated exposure to the same individuals, the memory traces become more firmly established, leading to a situation where a dishabituation trial introduces less novelty, as the spider's recognition of previously encountered individuals becomes more robust and consistent to the point where “habituation” and “dishabituation” trials become indistinguishable, as observed in Session 3. This method allows us to assess the duration of identity recognition in these spiders, indicating how long the memory of specific individuals persists. 

      All of these outcomes were anticipated before we began Experiment 1. Given that the results aligned with our predictions, we then sought to determine whether the observed reduction in the magnitude of the effect (i.e., the difference between habituation and dishabituation trials) was due to a physical fatigue effect, where the spiders might simply be getting tired, or a cognitive fatigue effect, where the spiders recognized the individuals and as a result did not exhibit any novelty response. To address this, we replicated the experiment with a new group of spiders and introduced special (long-term dishabituation) trials at the end, where the focal spider was presented with a novel spider. 

      These extra trials allowed us to disentangle the nature of the diminishing response across repeated sessions: a lack of dishabituation (remaining distant) would suggest general physical fatigue, whereas a strong dishabituation response (approaching closely) to the novel spider would indicate cognitive fatigue, thereby confirming that the spiders were indeed recognizing the familiar individuals throughout the experiment. 

      In light of these considerations, we believe that the complexity of our design is not only justified but absolutely necessary to rigorously test the cognitive capabilities of the spiders. Nonetheless, we understand the need for clarity in presenting our findings and are committed to refining our manuscript to better communicate the rationale and results of our study.

      The data appear to show the strong effects of differences between habituation and dishabituation trials in the first session. The decrease in differential behavior between the so-called habituation and dishabituation trials in sessions 2 and 3 is explained as a consequence of the spiders beginning to habituate in general to all of the individuals.

      The key question, as mentioned above, is to determine the underlying cause of this general habituation across sessions. Specifically, we aim to differentiate between two potential causes: physical fatigue, where the spiders may simply become less responsive due to the demands of the three-hour testing period, or cognitive fatigue, where the repeated exposure to the same individuals leads to a decreased response because the spiders have started to recognize these individuals over multiple repetitions.

      To address this, we replicated the experiment and introduced each focal spider to a new individual in what we termed "long-term dishabituation" trials. By comparing the spiders' responses to these novel individuals with their responses in earlier trials, we sought to better understand the underlying mechanisms of habituation and the duration of individual recognition. The strong dishabituation response observed in these trials is indicative of cognitive fatigue, supporting the presence of recognition memory rather than a general physical fatigue effect.

      The claim that the spiders remember specific individuals is somewhat undercut because all of the 'dishabituation' trials in session 2 are toward spiders they already met for 14 minutes previously but seemingly do not remember in session 2. 

      We appreciate the reviewer’s comment regarding the claim that spiders do not remember specific individuals. This assessment does not align with the rationale of our experiment. The reviewer noted that the dishabituation trials in session 2 involved spiders previously encountered and suggested that the lack of a clear memory response might undercut the claim of specific individual recognition. 

      However, as we explained earlier, we expect habituation in Session 2 relative to Session 1 precisely because spiders recognize each other in Session 2. If there were no such habituation in Sessions 2 or 3, it would suggest that the spiders’ recognition memory does not persist beyond one hour. 

      Additionally, it is important to correct the timing noted by the reviewer: each individual spider reencounters the same spider exactly one hour later, not 14 minutes. This is detailed in Table 2 of the manuscript, which outlines that each trial lasts 7 minutes, with a 3-minute visual separation between trials. With six trials per session, this totals to 1 hour per session. Thus, every pair of spiders re-encounters exactly 1 hour after their last interaction.
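
      The timing arithmetic above can be checked with a minimal sketch. The trial length, separation length and trial count are the values stated in the response; the function names are hypothetical, and the sketch assumes that every trial, including the last one in a session, is followed by one visual separation.

```python
# Timing values as stated in the response (Table 2 of the manuscript).
TRIAL_MIN = 7           # duration of one trial, in minutes
SEPARATION_MIN = 3      # visual separation after each trial, in minutes
TRIALS_PER_SESSION = 6


def session_length_min():
    """Length of one session, assuming each trial is followed by one separation."""
    return TRIALS_PER_SESSION * (TRIAL_MIN + SEPARATION_MIN)


def trial_start_min(session, trial):
    """Start time in minutes from experiment onset; both indices are 0-based."""
    slot = TRIAL_MIN + SEPARATION_MIN
    return session * session_length_min() + trial * slot


if __name__ == "__main__":
    # Offset between the same trial position in consecutive sessions.
    print(session_length_min())
    print(trial_start_min(1, 0) - trial_start_min(0, 0))
```

      Under these assumptions, the same trial position recurs one session length later, which is the interval the response refers to.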

      Again, it is important to clarify that the observed decrease in differential behaviour is not indicative of a failure to remember specific individuals. Rather, it reflects a systematic pattern of habituation, which is a common and expected outcome in such paradigms. This systematic decrease in response strength suggests that the spiders recognize the previously encountered individuals and become less responsive over repeated exposures, consistent with the process of habituation. In different terms, the repeated exposure to the same individuals leads to more firmly established memory traces, leading to a situation where a dishabituation trial introduces less novelty, as the spider's recognition of previously encountered individuals becomes more robust and consistent.

      Based on the explanations provided above, we respectfully reject the claim that “the spiders remember specific individuals is somewhat undercut […]”. On the contrary, the exact opposite is true. The very strength of our study lies in demonstrating that spiders possess robust recognition memory, as evidenced by a clear dissociation of habituation and dishabituation trials in Session 1, followed by a gradually diminishing effect over Sessions 2 and 3 as the spiders are increasingly exposed to the same individuals. This is further supported by the strong rebound from habituation observed in the long-term dishabituation trials, in which the spiders were exposed to novel individuals.

      This misunderstanding suggests that we should take additional care in the revised manuscript to clarify our explanations and provide more detail, ensuring that the rationale behind our experimental design and findings are communicated effectively.

      In session 3 it is ambiguous what is happening because the spiders no longer differentiate between the trial types. This could be due to fatigue or familiarity. 

      The reviewer proposes that the absence of differentiation between 'habituation' and 'dishabituation' trials in Session 3 might be attributed to either fatigue or familiarity. We interpret "fatigue" as what we have termed the “physical fatigue effect” and "familiarity" as “cognitive fatigue effect.” In this context, we concur with the reviewer’s observation, and this very line of reasoning prompted us to conduct a further experiment following the outcome of Experiment 1.

      A second experiment is done to show that introducing a totally novel individual recovers a large dishabituation response, suggesting that the lack of differences between 'habituation' and 'dishabituation' trials in session 3 is the result of general habituation to all of the spiders in the session rather than fatigue. As mentioned before, these data do support the claim that spiders differentiate among individuals.

      As the reviewer rightly noted, we addressed these possibilities in our second experiment by introducing a completely novel individual to the spiders, which resulted in a strong dishabituation response. This outcome suggests that the lack of differentiation in Session 3 is more likely due to cognitive habituation rather than physical fatigue. The robust response to novel individuals demonstrates that the spiders are capable of distinguishing between familiar and unfamiliar individuals, suggesting that the reduced differentiation is a consequence of habituation from repeated encounters with the same individuals. 

      We appreciate the reviewer's recognition that these findings support the conclusion that spiders are capable of differentiating between individual conspecifics.

      Additionally, it is important to clarify the structure of our sessions. Each of the 6 trials lasts 7 minutes with a 3-minute visual separation, resulting in a total of 1 hour per session. This ensures that each pair of spiders is encountered exactly one hour later, which controls for the timing and allows us to evaluate the spiders' recognition memory over repeated sessions.

      In summary, while the data show a decrease in differential behaviour between habituation and dishabituation trials in Sessions 2 and 3, the results from our second experiment support the interpretation that this is due to ‘cognitive habituation’ (familiarization) rather than ‘physical fatigue’ (general habituation). This habituation effect underscores the spiders' ability to recognize and become familiar with specific individuals over time, reinforcing our conclusion that they can differentiate among individuals.

      The data from session 1 are easy to interpret. The data from sessions 2 and 3 are harder to understand, but these are the trials in which they meet an individual again after a substantial period of separation. 

      The data from Session 1 are straightforward to interpret, showing clear differences between habituation and dishabituation trials. However, the data from Sessions 2 and 3 are more complex, as these sessions involve the spiders re-encountering individuals after a 1-hour period of separation. Importantly, the outcome is not an artefact in our experiment, but the consequence of a deliberate choice in the experimental design to assess whether spiders can recognise each other after this duration. We believe that this complexity aligns with our expectations, based on the assumption that spiders can recognise each other after one hour. The observed pattern of habituation in Sessions 2 and 3 suggests that the spiders retain memory of the individuals, leading to decreased responsiveness upon repeated encounters. This interpretation is further supported by Experiment 2, which introduced a novel individual and elicited a strong dishabituation response. This finding confirms that the reduced differentiation in later sessions is due to cognitive habituation rather than physical fatigue, supporting the conclusion that recognition memory lasts at least one hour.

      We hope this explanation clarifies our findings and the rationale behind our relatively complex experimental design choice. 

      Other studies looking at recognition in ants and wasps (cited by the authors) have done a 4 trial design in which focal animal A meets B in the first trial, then meets C in the second trial, meets B again in the third trial, and then meets D in the last trial. In that scenario trials 1, 2, and 4 are between unfamiliar individuals and trial 3 is between potentially familiar individuals. In both the ants and wasps, high aggression is seen in species with and without recognition on trial 1, with low aggression specifically for trials with familiar individuals in species with recognition. Across different tests, species or populations that lack recognition have shown a general reduction in aggression towards all individuals that become progressively less aggressive over time (reminiscent of the session 2 and 3 data) while others have maintained modest levels of aggression across all individuals. The 4 session design used in those other studies provides an unambiguous interpretation of the data while controlling for 'fatigue'. 
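
      The familiar/unfamiliar structure of the 4-trial design described above can be made explicit with a short sketch. The function name and the partner labels are hypothetical; the only rule applied is that a trial counts as familiar when the focal animal has already met that partner.

```python
def label_encounters(partners):
    """Label each trial 'familiar' if the focal animal has met that partner before."""
    seen = set()
    labels = []
    for partner in partners:
        labels.append("familiar" if partner in seen else "unfamiliar")
        seen.add(partner)
    return labels


if __name__ == "__main__":
    # Focal animal A meets B, then C, then B again, then D.
    print(label_encounters(["B", "C", "B", "D"]))
```

      With this sequence, only the third trial is labelled familiar, which matches the description of trials 1, 2 and 4 as encounters between unfamiliar individuals.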

      We acknowledge that there are multiple ways to design experiments to test recognition memory. In fact, we considered using the paradigm similar to the one proposed by the reviewer and used in studies like Dreier et al., which involves a series of trials with unfamiliar and familiar individuals over extended intervals. We then, however, opted for a more complex design to rigorously assess how habituation and recognition memory develop over repeated sessions with shorter intervals.

      In the following, we would like to describe the advantages and disadvantages of both paradigms and outline how we ended up using the more complex version:

      Advantages of our paradigm: 

      As pointed out, by repeating the sequence in exactly the same manner (every pair of spiders recurs after exactly 1 and 2 hours), we can comprehensively evaluate the effect of habituation over multiple exposures. This allows us to assess the extent of the spiders’ memory, for example when a spider shows stronger habituation to individuals that were novel in Session 1 but “familiar” by the time they encounter them again in Session 2. To achieve this, we need to ensure that each trial and visual separation is precisely timed, ensuring consistent intervals between encounters. As a consequence, each individual spider undergoes the exact same experimental protocol. Most critical, however, are the novel individuals presented after Session 3 (long-term dishabituation trials), which help differentiate between cognitive habituation and physical fatigue.

      Disadvantages of our paradigm:

      The sequences of habituation and dishabituation trials may make the design more complex, as pointed out by the reviewer. As a consequence, the interpretation will become more difficult. However, the data perfectly align with our predictions, and the outcomes were as anticipated in two independently run experiments with two groups of spiders. This highlights the reliability of our experimental design and robustness of our findings.

      Advantages of the 4-trial paradigm proposed by the reviewer:

      Clearly, the structure of the proposed design is simpler, making interpretation easier. The paradigm also accommodates longer intervals between trials (e.g., 24 hours). Longer intervals could theoretically have been applied in our study. (However, we chose not to leave the spiders in the experimental box longer than necessary, opting instead to return them to their home containers for the night to ensure their well-being. Moreover, a 24-hour interval targets a different phase in the process of long-term memory; we return to this topic further below.)

      Disadvantages of the 4-trial paradigm proposed by the reviewer:

      Strictly replicating the 4-trial design would result in one familiar encounter versus three unfamiliar ones. This imbalance might introduce bias and limit the robustness of the measurements. Additionally, the design provides less data overall, as the focal individual will be confronted with three other individuals, who will then be excluded from further testing as focal subjects themselves. In contrast, our design ensures a balanced number of familiar (habituation) and novel encounters (dishabituation) for each focal individual, allowing for more efficient and comprehensive data collection without excluding individuals from further testing.

      Given the aforementioned considerations, we determined that the advantages of our experimental design, in particular the assessment of a cognitive fatigue effect when encountering the same individuals again, outweigh those of the proposed 4-trial design. The mentioned limitations of the 4-trial design, such as the potential for bias and less comprehensive data collection, do not justify re-running the study, especially when the best case scenario is fewer insights than our already existing findings. Our current paradigm yielded results that align perfectly with our predictions, offering a thorough and reliable understanding of recognition memory and habituation in spiders. Therefore, we believe our approach provides a more complete and robust answer to our research questions.

      However, we acknowledge that there might be insufficient information in the manuscript addressing the rationale behind our design choices, and we will revise the manuscript to provide a clearer explanation of why our approach is well suited to answering the research questions at hand.

      That all trials in sessions 2 and 3 are always with familiar individuals makes it challenging to understand how much the spiders are habituating to each other versus having some kind of associative learning of individual identity and behavior.

      We understand the reviewer's concern that having all trials in Sessions 2 and 3 involve familiar individuals could make it challenging to distinguish between general habituation and associative learning of individual identities. In our study, we contrast habituation and dishabituation trials: If general habituation were occurring, we would expect uniformly reduced responses (around the zero line) to all individuals over time, indicating that the spiders are getting used to any individual regardless of their specific identity. However, this is not the case. Our data show that while the responses in Session 2 are reduced in effect size compared to Session 1, they are not flat (around the zero line). This indicates that the spiders still differentiate between a repetition of a spider identity (habituation trials) and two different spider identities (dishabituation trials), albeit with a reduced response strength. The systematicity in the data suggests that the spiders are not merely habituating to any individual, but are instead retaining some level of recognition between specific individuals.

      Only by Session 3 do the spiders fully habituate to the point where the responses to habituation and dishabituation trials converge, indicating a complete habituation effect. The introduction of novel individuals in our long-term dishabituation trials further supports the idea that the spiders are recognizing specific individuals rather than exhibiting general habituation. If the spiders were experiencing general habituation, we would not expect the strong dishabituation response observed in our study.

      The data presentation is also very complicated. How is it the case that a negative proportion of time is spent? The methods reveal that this metric is derived by comparing the time individuals spent in each region relative to the previous time they saw that individual. 

      We understand the reviewer's concern regarding the complexity of the data presentation and the calculation of the negative proportion of time. Regarding the complexity of the design, we have already justified our choice of a more intricate experimental setup. This complexity is necessary for accurately assessing recognition memory and habituation over repeated sessions. 

      The metric is derived by comparing the time individuals spent in each region (relative to the transparent front panel) in the current trial (n) relative to the previous trial (n-1). With multiple trials, this results in a cascade of trials and conditions. This method was established in Humphrey’s and our previous study (Humphrey, 1974; Dahl, Logothetis, Hoffman, 2007), where we demonstrated its effectiveness in assessing individuation of faces in macaque monkeys.

      Also in our current experimental design, each current trial is contrasted with the preceding one, allowing us to compare distributions of distances taken in two trials. In this context, every preceding trial serves as baseline for every current trial. 

      Figure 1 of the manuscript illustrates the structure and analysis of the trials:

      Panel a depicts the baseline, habituation, and dishabituation trials, where spiders are exposed to different conspecifics.

      Baseline (left panel, red): When two spiders are visually exposed to each other for the first time, it is expected that they will explore each other closely, exhibiting high levels of proximity (initial exploratory behaviour).

      Habituation (centre panel, green): When the same spiders are reintroduced in a subsequent round of exposure, it is anticipated that they will exhibit reduced exploratory behaviour and maintain a greater distance compared to the baseline trial, if they recognize each other from the previous encounter (indicative of habituation).

      Panel b (upper and middle panels; red and green): Demonstrates the theoretical assumptions and expected changes in behaviour:

      By subtracting the distribution of distances in the baseline trial from the habituation trial, we generate a delta distribution. This delta distribution reveals negative values near the transparent panel (indicating reduced proximity in the habituation trial) and positive values at mid- to far-distances (indicating increased distancing behaviour). This delta distribution is also what is reported in Figure 2.

      Dishabituation: In this trial, a new spider (different from the one in the habituation trial) is introduced. The dishabituation trial will be considered in contrast to the habituation trial described above. If the spider recognizes the new individual as different, it is expected to show increased exploratory behaviour and reduced distance, similar to the initial baseline trial.

      By subtracting the distribution of distances in the habituation trial from the dishabituation trial, we obtain another delta distribution. This delta distribution should reveal positive values near the transparent panel (indicating increased proximity in the dishabituation trial) and negative values at mid- to far-distances (indicating decreased proximity compared to the habituation trial).
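
      The subtraction described above can be illustrated with a minimal sketch. It is not the analysis code of the manuscript: the bin edges, the sample distances and the function names are hypothetical, and each distribution is assumed to be expressed as the proportion of samples per distance bin before the previous trial is subtracted from the current one.

```python
def distance_histogram(distances, bin_edges):
    """Proportion of samples falling in each bin [edge_i, edge_{i+1})."""
    counts = [0] * (len(bin_edges) - 1)
    for d in distances:
        for i in range(len(counts)):
            if bin_edges[i] <= d < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def delta_distribution(current, previous, bin_edges):
    """Per-bin difference: current trial (n) minus previous trial (n-1)."""
    cur = distance_histogram(current, bin_edges)
    prev = distance_histogram(previous, bin_edges)
    return [c - p for c, p in zip(cur, prev)]


if __name__ == "__main__":
    edges = [0, 5, 10, 15]          # hypothetical bins: near, mid, far
    baseline = [1, 2, 3, 6]         # hypothetical distances, mostly near
    habituation = [2, 7, 8, 12]     # hypothetical distances, shifted away
    print(delta_distribution(habituation, baseline, edges))
```

      Because both distributions are proportions that sum to one, the per-bin differences sum to zero, so a negative value in one region is necessarily offset by positive values elsewhere. This is how a negative proportion of time can arise in the delta plots.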

      We hope this clarifies the rationale behind our data presentation and the methodological approach we employed. We have revised the figure to enhance its clarity and make it more intuitive for the reader.

      Dahl, C. D., Logothetis, N. K., & Hoffman, K. L. (2007). Individuation and holistic processing of faces in rhesus monkeys. Proceedings of the Royal Society B: Biological Sciences, 274(1622), 2069-2076.

      Humphrey, N. K. (1974). Species and individuals in the perceptual world of monkeys. Perception, 3(1), 105-114.

      At the very least, data showing the distribution of distances from the wall would be much easier to interpret for the reader.

      We understand the reviewer's concern that data showing the distribution of distances from the wall would be much easier for the reader to interpret. We initially considered this approach but concluded that it is not straightforward. For instance, if both spiders are positioned at the very front but in different corners, each spider's distance to the panel would be very small, but the distance between the spiders would be large. Thus, using distances from the wall could misrepresent the actual spatial distribution between the spiders.
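
      The corner case can be illustrated with a short Python sketch. The coordinate system and the positions are hypothetical and do not reflect the actual arena dimensions: the transparent panel is assumed to lie along the line y = 0, with one spider on each side.

```python
import math


def distance_to_panel(position):
    """Perpendicular distance from a spider to the panel (assumed to lie at y = 0)."""
    _, y = position
    return abs(y)


def inter_spider_distance(a, b):
    """Straight-line distance between two spiders."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical positions: both spiders sit right at the panel, in opposite corners.
spider_a = (0.0, 0.5)
spider_b = (10.0, -0.5)

wall_a = distance_to_panel(spider_a)                 # small
wall_b = distance_to_panel(spider_b)                 # small
between = inter_spider_distance(spider_a, spider_b)  # large
```

      In this example both wall distances equal 0.5, whereas the spiders are more than 10 units apart, so a wall-distance measure would score the pair as being in close proximity when they are not.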

      (2) "Long-term social memory": It is not entirely clear what the authors mean by 'long-term social memory', though typically long-term memory refers to a form of memory that requires protein synthesis.

      To address this conceptually, we used the term "long-term social memory" to describe the spiders' ability to recognize and remember individual conspecifics over multiple experimental sessions. While social memory refers to the ability of an individual to recognize other individuals within a social context, long-term memory typically involves the retention of information over extended periods. Recognizing that the term “long-term social memory” is not commonly used, we have revised the manuscript to use the more standard term “long-term memory.”

      While the precise timing of memory formation varies across species and contexts, a general rule is that long-term memory should last for > 24 hours (e.g., Dreier et al 2007 Biol Letters). The longest time that spiders are apart in this trial setup is something like an hour. There is no basis to claim that spiders have long-term social memory as they are never asked to remember anyone after a long time apart.

      We appreciate the reviewer’s feedback regarding the term "long-term social memory." The statement "long-term memory should last for > 24 hours" is a generalisation that oversimplifies a more complex topic. Long-term memory is typically distinguished from short-term memory by its persistence over time, often lasting from hours to a lifetime. However, the exact duration that qualifies memory as "long-term" varies depending on the context, model species, and type of memory. In studies of synaptic plasticity (LTP), the objective might indeed be to look at memory that persists for at least 24 hours as a criterion for long-term memory. In studies of cellular and/or molecular mechanisms, where the stabilization and consolidation of memory traces over time are key areas of interest, this 24-hour interval is very common. But defining long-term memory strictly by a 24-hour duration is by no means universally accepted, nor does it apply across all fields of study.

      To clarify, long-term memory is a process involving consolidation that starts within minutes to hours after learning. Full consolidation can take longer, and memory persisting for 24 hours is considered fully consolidated. But this does not mean that memories lasting less than 24 hours are not part of long-term memory.

      In fact, Shiffrin and Atkinson (1969) proposed that information entering short-term memory remains there for about 20 to 30 seconds before being displaced due to space limitations. During this brief interval, initial encoding processes begin transferring information to long-term memory, establishing an initial memory trace. This transfer is not indicative of full consolidation but represents the initial "laying down" of the memory trace (encoding). In our study, the focal spider’s brain forms initial memory traces of the individuals it encounters. This process continues during the period of visual separation. Upon re-encountering the same individual a few minutes later, the spider accesses the initial memory trace stored in long-term memory. This trace is fragile and not fully consolidated. The re-encounter acts as a rehearsal, reactivating specific memory traces and potentially strengthening them through additional encoding processes, allowing the spider to recognize the individual even an hour later.

      According to Markowitsch (2013), initial encoding in long-term memory begins within seconds to minutes. It is also important to note that we argue for identity recognition rather than identity recall. Recognition involves correctly identifying a stimulus when it is presented again, while recall requires the volitional generation of information without an external stimulus. Thus, recall may rely on deeper forms of memory consolidation than recognition.

      Is protein synthesis required for long-term memory? 

      The role of protein synthesis in long-term memory has been extensively studied. According to Castellucci et al. (1978), explicit memory comprises a short-term phase that does not require protein synthesis and a long-term phase that does. Hebbian learning in its initial phase (early LTP) does not necessarily require protein synthesis. This phase involves the rapid strengthening of synapses through existing proteins and signaling pathways, such as the activation of NMDA receptors and the influx of Ca2+ ions. For the changes to persist (late LTP), protein synthesis is important. This phase involves the production of new proteins that contribute to long-term structural changes at the synapse, such as the growth of new synaptic connections or the stabilization of existing ones.

      This differentiation between the early and late phases of LTP highlights that long-term memory can begin forming without immediate protein synthesis. Our study focuses on this early phase of memory encoding, which involves the initial formation of memory traces that do not yet depend on protein synthesis. 

      It is however worth noting that recent research suggests that there is an early phase of protein synthesis (within minutes to hours) through the activation of immediate early genes (IEGs) and transcription factors. In this context, protein synthesis supports initial synaptic modifications. What the reviewer refers to is the consolidation phase (late phase), where continued synthesis of proteins induces structural changes at synapses, leading to the formation of new synaptic connections. In our study, it is plausible to assume that an early form of protein synthesis may contribute to stabilizing the initial memory traces during the encoding phase. However, whether or not protein synthesis occurred in our spiders is beyond the scope of this investigation and was not specifically addressed.

      The critical aspect of our study is that the information transitioned from short-term memory to long-term memory during an early encoding phase, allowing recognition after an hour. Due to the inherent limitations and transient nature of short-term memory, it is implausible for spiders to retain these memory representations solely within short-term memory for such durations. Our findings suggest that the initial encoding processes were robust enough to transfer these experiences into long-term memory, where they were stabilized and could be accessed later.

      In sum, it is important to note that long-term memory is a dynamic process, and while testing after 24 hours is a convention in some studies, this timing is arbitrary and not universally applicable to all contexts or species. The more critical consideration here is that we are dealing with a species in which no prior evidence of long-term memory exists. Debating a 24-hour delay or the specifics of protein synthesis, while potentially interesting for future studies, detracts from the true significance of our findings. Our study is the first to show something akin to long-term memory representations in this species, and this should remain the focus.

      Shiffrin, R. M., & Atkinson, R. C. (1969). Storage and retrieval processes in long-term memory. Psychological Review, 76(2), 179.

      Markowitsch, H. J. (2013). Memory and self–Neuroscientific landscapes. International Scholarly Research Notices, 2013(1), 176027.

      Castellucci, V. F., Carew, T. J., & Kandel, E. R. (1978). Cellular analysis of long-term habituation of the gill-withdrawal reflex of Aplysia californica. Science, 202(4374), 1306-1308.

      The odd phrasing of the 'long-term dishabituation' trial makes it seem that it is testing long-term memory, but it is not. The spiders have never met. The fact that they are very habituated to one set of stimuli and then respond to a new stimulus is not evidence of long-term memory. To clearly test memory (which is the part really lacking from the design), the authors would need to show that spiders, upon the first instance of re-encountering a previously encountered individual, are already 'habituated' to them but not to some other individuals. The current data suggest this may be the case, but it is just very hard to interpret given the design does not directly test the memory of individuals in a clear and unambiguous manner.

      While we appreciate the reviewer's feedback, we believe there may have been some misunderstanding regarding the term “long-term dishabituation.” The introduction of novel individuals at the end of Session 3 was not intended to test long-term memory by having spiders recognize these novel individuals. Instead, it aimed to investigate the nature of the habituation observed over the three sessions.

      The novel individuals introduced at the end of Session 3 serve to differentiate between general habituation (a decline in response due to repeated exposure to any stimulus) and specific habituation (recognition of, and reduced response to, previously encountered individuals). The novel spiders have never been encountered before, so the focal spiders cannot have prior representations of them. Thus, the strong dishabituation response to these novel individuals indicates that the habituation observed earlier is not due to a general fatigue effect or loss of interest but rather a specific habituation effect to the familiar individuals. By showing such a strong and increased response to novel individuals, the study demonstrates that the spiders' increasingly reduced responses in Sessions 2 and 3 are not merely due to a general decrease in responsiveness but suggest cognitive habituation. This cognitive habituation implies that the spiders remember the familiar individuals (as each of them occurred three times across the three sessions), a process that relies on long-term memory. Therefore, while the novel spiders themselves are not a direct test of long-term memory, the use of these novel spiders helps us infer that the habituation observed over the three sessions is indeed due to the formation of long-term memory traces.

      In other words, the organism detects and processes the novel stimulus as different from the habituated one. In our study, if a spider showed a strong dishabituation response to a novel individual introduced at the end of Session 3, it would indicate that the spider had formed specific representations of the individuals they encountered during the three sessions. These representations allow the spiders to recognise the novel individuals as different, leading to renewed interest and a stronger behavioural response. It is the absence of a prior representation for the novel spiders that triggers this dishabituation response. Since the novel spider does not match any stored representations of the previously encountered spiders, the focal spider responds more strongly.

      The introduction of novel individuals at the end of Session 3 helps clarify that the increasing habituation observed in Sessions 2 and 3 is specific to familiar individuals, indicating cognitive habituation. This supports the presence of long-term memory processes in the spiders, as they can distinguish between previously encountered individuals and new ones. The habituation-dishabituation paradigm thus effectively demonstrates the spiders' ability to form and reactivate encoded memory traces, providing clear evidence of recognition memory.

      For these reasons, we are convinced that our interpretation is accurate and hope this clarification renders the additional request for an entirely new experiment unnecessary.

      (3) Lack of a functional explanation and the emphasis on 'asociality': It is entirely plausible that recognition is a pleiotropic byproduct of the overall visual cognition abilities in the spiders.

      We agree with the reviewer that it is essential to consider the broader context of individual recognition and its potential adaptive significance. The possibility that recognition in jumping spiders could be a pleiotropic byproduct of their advanced visual cognition abilities is indeed a plausible explanation and has been discussed in our manuscript.

      However, the discussion that discounts territoriality as a potential explanation is not well laid out. First, many species that are 'asocial' nevertheless defend territories. It is perhaps best to say such species are not group living, but they have social lives because they encounter conspecifics and need to interact with them.

      The reviewer also correctly points out that many 'asocial' species still defend territories and have social interactions. Our use of the term 'asocial' was meant to indicate that jumping spiders do not live in cohesive social groups, but we acknowledge that they do have social lives in terms of interactions with conspecifics. It is more accurate to describe these spiders as a non-group-living, yet socially interactive, species. A better term is “non-social”, which refers to the jumping spider as a species that does not live in stable social groups and does not exhibit associated behaviours, such as cooperative behaviours. This term also implies that individuals still interact with conspecifics, especially in contexts like mating, territorial disputes or aggression. We thus change the term from “asocial” to “non-social” in the manuscript.

      Indeed, there are many examples of solitary living species that show the dear enemy effect, a form of individual recognition, towards familiar territorial neighbors. The authors in this case note that territorial competition is mediated by the size or color of the chelicerae (seemingly a trait that could be used to distinguish among individuals). Apparently, the fact that previous work has suggested that territorial disputes can be mediated by a trait in the absence of familiarity has led them to discount the possibility that keeping track of the local neighbors in a potentially cannibalistic species could be a sufficient functional reason. In any event, the current evidence presented certainly does not warrant discounting that hypothesis.

      The “dear enemy effect”, where solitary living species recognize and show reduced aggression towards familiar territorial neighbors, is a relevant consideration. This effect demonstrates that individual recognition can have significant functional implications even in species that are not group-living. We will elaborate on this effect in the revised manuscript to provide a more comprehensive discussion.

      The reviewer mentioned that territorial disputes can be mediated by the size or color of the chelicerae, potentially serving as a feature for individual recognition. Our intention was not to discount the role of such traits but to highlight that the level of identity recognition we observed represents subordinate classification. This is different from the basic-level classification, such as distinguishing between male and female based on chelicerae colour. While we acknowledge that colour can be an important feature for identity discrimination, our findings suggest that individual recognition in jumping spiders goes beyond simple colour differentiation. 

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors investigated whether a salticid spider, Phidippus regius, recognizes other individuals of the same species. The authors placed each spider inside a container from which it could see another spider for 7 minutes, before having its view of the other spider occluded by an opaque barrier for 3 minutes. The spider was then either presented with the same individual again (habituation trial) or a different individual (dishabituation trial). The authors recorded the distance between the two spiders during each trial. In habituation trials, the spiders were predicted to spend more time further away from each other and, in dishabituation trials, the spiders were predicted to spend more time closer to each other. The results followed these predictions, and the authors then considered whether the spiders in habituation trials were generally fatigued instead of being habituated to the appearance of the other spider, which may have explained why they spent less time near the other individual. The authors presented the spiders with a different (novel) individual after a longer period of time (which they considered to be a long-term dishabituation trial), and found that the spiders switched to spending more time closer to the other individual again during this trial. This suggested that the spiders had recognized and had habituated to the individual that they had seen before and that they became dishabituated when they encountered a different individual.

      We appreciate the reviewer's detailed summary of our study. The reviewer's summary accurately captures the essence of our experimental design, predictions, and findings.

      Strengths:

      It is interesting to consider individual recognition by Phidippus regius. Individual recognition by an invertebrate is already known, for instance, for a species of social wasp, but Phidippus regius is a different animal. Importantly and more specifically, P. regius is a salticid spider, and these spiders are known to have exceptional eyesight for animals of their size, potentially making them especially suitable for studies on individual recognition. In the current study, the results from experiments were consistent with the authors' predictions, suggesting that the spiders were recognizing each other by being habituated to individuals they had encountered before and by being dishabituated to individuals they had not encountered before. This is a good start in considering individual recognition by this species.

      We appreciate the reviewer's positive summary and acknowledgment of the strengths of our study. We would like to point out some more details: 

      While the exceptional eyesight of salticid spiders is indeed a significant factor, our study reaches deeper in terms of processing. We argue not at the level of sensation but at the level of perception. Moreover, identity recognition is a higher-level perceptual process. This distinction is crucial: we are not merely examining the spiders' sensory capabilities (such as good eyesight), but rather how their brains interpret and represent what they “see”. This involves a cognitive process in which the sensory input (sensation) is processed and integrated into meaningful constructs (perception) and memorised in the form of representations.

      Our study also suggests that P. regius engages in “higher-level” perceptual processes. This most likely involves complex representations of individual conspecifics, which in mammalian brains are associated with regions such as the central inferior temporal (cIT) and anterior inferior temporal (aIT) areas. We provide evidence that these spiders do not just sense visual stimuli but interpret and recognize individual identities, indicating sophisticated perceptual and cognitive abilities. In other words, the spiders do not merely respond to visual stimuli in a reflexive manner, but rather engage in sophisticated perceptual and cognitive processes that allow them to recognize and distinguish between individual identities. This indicates that the spiders are not simple Braitenberg vehicles reacting to stimuli, but are thinking organisms capable of complex mental representations. This resonates with current trends in animal cognition research, which increasingly recognize some level of consciousness and advanced cognitive abilities across a wide range of animal species. Moreover, this aligns with the growing interest in and recognition of spider cognition, where research is beginning to provide evidence for the cognitive complexity and perceptual capabilities of these often underestimated creatures (Jackson and Cross, 2011).

      Jackson, R. R., & Cross, F. R. (2011). Spider cognition. Advances in Insect Physiology, 41, 115-174.

      Weaknesses:

      The experiments in this manuscript (habituation/dishabituation trials) are a good start for considering whether individuals of a salticid species recognize each other. I am left wondering, however, what features the spiders were specifically paying attention to when recognizing each other. The authors cited Sheehan and Tibbetts (2010) who stated that "Individual recognition requires individuals to uniquely identify their social partners based on phenotypic variation." Also, recognition was considered in a paper on another salticid by Tedore and Johnsen (2013).

      Tedore, C., & Johnsen, S. (2013). Pheromones exert top-down effects on visual recognition in the jumping spider Lyssomanes viridis. The Journal of Experimental Biology, 216, 1744-1756. doi: 10.1242/jeb.071118 

      In this elegant study, the authors presented spiders with manipulated images to find out what features matter to these spiders when recognizing individuals.

      The reviewer raises an important point regarding the specific features that Phidippus regius might be paying attention to when recognizing individual conspecifics. Our study indeed cited Sheehan and Tibbetts (2010) to highlight the importance of phenotypic variation in individual recognition. Additionally, we referenced the work by Tedore and Johnsen (2013) on visual recognition in another salticid species, which suggests that multiple sensory modalities, including visual and pheromonal cues, may be involved in the recognition process. While our current study focused on demonstrating that Phidippus regius can recognize individual conspecifics, we acknowledge that it does not specifically identify the phenotypic features involved in this recognition. 

      Part of the problem with using two living individuals in experiments is that the behavior of one individual can influence the behavior of the other, and this can bias the results.  

      We appreciate the reviewer's observation regarding the potential bias introduced by using two living individuals in experiments, as the behaviour of one individual can indeed influence the behaviour of the other. We shared this concern initially; however, the consistency of the data with our hypotheses suggests that this potential bias did not adversely affect the validity of our findings, rendering the concern largely illusory at least in the context of our study.

      We opted for the living-individual paradigm for the following reasons:

      There is a growing trend in ethological as well as animal cognition research towards more ecologically valid and biologically relevant settings, while simultaneously advancing the precision and quantification of the data collected. This is referred to as computational ethology.

      This approach advocates for assessing behaviour in environments that more closely resemble natural conditions, rather than relying solely on sterile and artificial experimental setups. The rationale is that such naturalistic arenas allow animals to exhibit a broader range of behaviours and interactions, providing a more accurate reflection of their cognitive and social abilities. The challenge, however, lies in navigating the inherent tradeoff between the strict control offered by standardized procedures and the ecological validity of more naturalistic interactions.

      By allowing two spiders to confront each other, we aimed to capture authentic behavioural responses while maintaining a degree of experimental standardization through the use of a controlled setup. Our approach ensures that the behaviours observed are not merely artifacts of an artificial environment but are representative of genuine social interactions. Also, to minimize potential biases arising from mutual behavioural influences, we employed a controlled and repeatable experimental environment. 

      We believe that the chosen approach provides a meaningful balance (in the above-mentioned trade-off) between ecological validity and experimental rigour. By combining a standardized environment with the naturalistic interaction of real spiders, we ensured that our findings are both scientifically robust and biologically relevant.

      However, this issue can be readily avoided because salticids are well known, for example, to be highly responsive to lures (e.g. dead prey glued in lifelike posture onto cork disks) and to computer animation. 

      While it is true that salticid spiders are responsive to lures and computer animations, we carefully considered the most appropriate and ecologically valid approach for our study. Our aim was to capture genuine behavioural patterns in a context that closely mimics the natural encounters these spiders experience.

      Additionally, creating comparable video stimuli of spiders presents its own set of challenges: Video recordings or computer animations may not fully capture the nuanced behaviours and subtle variations that occur during real-life interactions. There is also a risk that such stimuli could be perceived differently by the spiders, potentially introducing new biases or confounding factors.

      Scientific progress is not made by merely relying on previously established paradigms, especially when they may not be suitable for the specific context of a study. While alternative methods like lures or computer animations can be valuable in certain situations, our approach was deliberately chosen to best capture the naturalistic and interactive aspects of spider behaviour.

      These methods have already been successful and helpful for standardizing the different stimuli presented during many different experiments for many different salticid spiders, and they would be helpful for better understanding how Phidippus regius might recognize another individual on the basis of phenotypic variation. There are all sorts of ways in which a salticid might recognize another individual. Differences in face or body structure, or body size, or all of these, might have an important role in recognition, but we won't know what these are using the current methods alone. Also, I didn't see any details about whether body size was standardized in the current manuscript.

      As mentioned previously, the goal of our study was to demonstrate that identity recognition occurs in spiders. This alone is of significant importance, as it challenges existing assumptions about the cognitive capabilities of small-brained animals. We did not aim at providing a proximate explanation (mechanism) for identity recognition in spiders.

      The problem with what the reviewer suggested is this: As long as we do not have conclusive evidence that spiders recognize individual conspecifics, any attempt to design and manipulate stimuli would lack a solid foundation. Without understanding whether spiders have this capability, we cannot make informed decisions about which features or characteristics to manipulate in stimuli. In other words, this uncertainty means we lack a starting point for our assumptions, making it nearly impossible to create stimuli that would be useful or relevant in testing identity recognition.

      Additionally, it is nearly impossible to artificially generate a stimulus set that encompasses the natural variance in features that spiders use for visual individuation. There is no guarantee that artificial stimuli, such as lures or computer animations, would capture the relevant features that spiders use in natural interactions.

      In other words, the question of how Phidippus regius recognizes another individual will be the subject of further investigation. In this study, we focus on whether or not they individuate others.

      For another perspective, my thoughts turn to a paper by Cross et al.

      Cross, F. R., Jackson, R. R., & Taylor, L. A. (2020). Influence of seeing a red face during the male-male encounters of mosquito-specialist spiders. Learning & Behavior, 48, 104-112. doi: 10.3758/s13420-020-00411-y

      These authors found that males of Evarcha culicivora, another salticid species that is known to have a red face, become less responsive to their own mirror images after having their faces painted with black eyeliner than if their faces remained red. In all instances, the spiders only saw their own mirror images and never another spider, and these results cannot be interpreted on the basis of habituation/dishabituation because the spiders were not responding differently when they simply saw their mirror image again. Instead, it was specifically the change to the spider's face which resulted in a change of behavior. The findings from this paper and from Tedore and Johnsen can help give us additional perspectives that the authors might like to consider. On the whole, I would like the authors to further consider the features that P. regius might use to discern and recognize another individual.

      We acknowledge that identifying the specific features used by P. regius for identity recognition is a valuable direction for future research. However, we must emphasise that without first establishing whether spiders are capable of individuating each other, it would be premature and challenging to determine the specific features they rely on for this process. A lack of response to certain features could either suggest that those features are not relevant or, more critically, that the spider does not recognize individual identities at all. Thus, our initial focus on demonstrating identity recognition is essential before delving into the specific cues or characteristics involved.

      While the call for addressing the proximate causation of identity recognition in jumping spiders is valid, we need to also reiterate the significance of our findings and why they stand on their own merit:

      Our study demonstrates for the first time that Phidippus regius can systematically individuate conspecifics, showing habituation within short intervals (10 minutes) and over longer intervals (1 hour). This behaviour is not due to general habituation or physical fatigue but is a result of cognitive habituation, as illustrated by the spiders' response to novel individuals introduced after repeated encounters with familiarized ones. 

      What are the implications of this? Our findings indicate that these spiders possess long-term memory and form representations that can be reactivated after an hour. While this is most likely not fully consolidated memory formation (see our reply to Reviewer 1), it represents an encoded long-term memory. This implies that small-brained animals can remember, represent, and potentially build internal mental images, which are crucial for sophisticated cognitive processing.

      Reviewer #3 (Public Review):

      Summary:

      Jumping spiders (family Salticidae) have extraordinarily good eyesight, but little is known about how sensitive these small animals might be to the identity of other individuals that they see. Here, experiments were carried out using Phidippus regius, a salticid spider from North America. There were three steps in the experiments; first, a spider could see another spider; then its view of the other spider was blocked; and then either the same or a different individual spider came into view. Whether it was the same or a different individual that came into view in the third step had a significant effect on how close together or far apart the spiders positioned themselves. It has been demonstrated before that salticids can discriminate between familiar and unfamiliar individuals while relying on chemical cues, but this new research on P. regius provides the first experimental evidence that a spider can discriminate by sight between familiar and unfamiliar individuals.

      Clark RJ, Jackson RR (1995) Araneophagic jumping spiders discriminate between the draglines of familiar and unfamiliar conspecifics. Ethology, Ecology and Evolution 7:185-190

      We appreciate the reviewer's comprehensive summary and acknowledgment of the significance of our findings.

      Strengths:

      This work is a useful step toward a fuller understanding of the perceptual and cognitive capacities of spiders and other animals with small nervous systems. By providing experimental evidence for a conclusion that a spider can, by sight, discriminate between familiar and unfamiliar individuals, this research will be an important milestone. We can anticipate a substantial influence on future research.

      We appreciate the reviewer’s recognition of the strengths and significance of our study. We are pleased that the reviewer considers our research an important milestone. Our findings indeed suggest that even animals with relatively simple nervous systems can perform complex cognitive tasks, which has substantial implications for the broader study of animal cognition.

      As pointed out by the reviewer, we also hope that our study will have a substantial influence on future research. By establishing a methodology and providing clear evidence of visual discrimination, we aim to encourage further investigations into the cognitive abilities of jumping spiders and other arthropods. Future research can build on our findings to explore the specific visual cues and mechanisms involved in individual recognition (as Reviewer 2 pointed out), as well as the ecological and evolutionary implications of these abilities.

      Weaknesses:

      (1) The conclusions should be stated more carefully.

      We agree that clarity in our conclusions is paramount. We will revise the manuscript to ensure that our conclusions are presented with precision and appropriately reflect the data. Specifically, we will emphasize the evidence supporting our findings of visual individual recognition and clarify the limitations and scope of our conclusions to avoid any potential overstatements.

      (2) It is not clearly the case that the experimental methods are based on 'habituation' (learning to ignore; learning not to respond). Saying 'habituation' seems to imply that certain distances are instances of responding and other distances are instances of not responding but, as a reasonable alternative, we might call distance in all instances a response. However, whether all distances are responses or not is a distracting issue because being based on habituation is not a necessity.

      We appreciate the reviewer's feedback and understand the concern regarding the use of the term 'habituation.' We agree that all distances maintained by the spiders are responses, and we interpret them as the spiders’ “active decisions”, based on their perception of the other individual and modulated by whether they recognize it as the same or a different individual.

      The terms 'habituation' and 'dishabituation' are used to label trial types for ease of discussion and to describe the expected behavioural modulation.

      (3) Besides data related to distances, other data might have been useful. For example, salticids are especially well known for the way they communicate using distinctive visual displays and, unlike distance, displaying is a discrete, unambiguous response.

      We appreciate the reviewer’s suggestion to incorporate data on visual displays, which are indeed well-known communication methods among salticids. We agree that visual displays are discrete and unambiguous responses that could provide additional insights into the spiders' recognition abilities.

      Our primary focus on distance measurements was driven by the need to quantify behaviour in a continuous and scalable manner, that is, how spiders modulate their proximity based on familiarity with other individuals.

      We acknowledge the potential value of including visual display measurements; however, in our study, we aimed to establish a foundational understanding of recognition behaviour through proximity measures first. Also, capturing displays requires a different experimental paradigm, where the displays are clearly visible and analyzable.

      (4) Methods more aligned with salticids having extraordinarily good eyesight would be useful. For example, with salticids, standardising and manipulating stimuli in experiments can be achieved by using mounts, video playback, and computer-generated animation.

      There is no doubt that salticids have excellent eyesight. However, our study focuses on higher-level perceptual processes that require complex brain analysis, not just visual acuity. The goal was to investigate whether spiders can individuate and recognize conspecifics, which involves interpreting visual information and forming long-term representations.

      Clearly, methods like video playback and computer animations are useful in controlled settings, where the spider is mounted, but they pose challenges for our specific research question. At this stage of research, we lack precise knowledge of which visual features are critical for individual recognition in spiders, making it difficult to design effective artificial stimuli. 

      Our primary objective was to determine if spiders can individuate others. Before exploring the proximate mechanisms of how they individuate others, it was essential to establish that they have this capability. This foundational question needed to be addressed before delving into more detailed mechanistic studies.

      (5) An asocial-versus-social distinction is too imprecise, and it may have been emphasised too much. With P. regius, irrespective of whether we use the label asocial or social, the important question pertains to the frequency of encounters between the same individuals and the consequences of these encounters.

      Our intent was to convey that P. regius does not live in cohesive social groups but does engage in individual interactions that can have significant behavioral consequences. We will revise the manuscript to reduce the emphasis on the asocial-versus-social distinction. As discussed above, we also will change the term “asocial” to “non-social” in the manuscript.

      (6) Hypotheses related to not-so-strictly adaptive factors are discussed and these hypotheses are interesting, but these considerations are not necessarily incompatible with more strictly adaptive influences being relevant as well.

      We appreciate the reviewer's observation regarding the discussion of hypotheses related to not-so-strictly adaptive factors. We agree that our considerations of these factors do not preclude the relevance of more strictly adaptive influences.

      We will revise the manuscript to explicitly discuss how our findings can be interpreted in the context of adaptive hypotheses. This will provide a more comprehensive understanding of the evolutionary significance of individual recognition in P. regius. Modifications were made in the Discussion section.

      In the following, we comment on issues not mentioned in the “public reviews” section.

      Reviewer #1 (Recommendations For The Authors):

      (1) I would suggest conducting experiments that actually test for recognition memory, as this seems to be a claim that the authors make. Following the ant studies by Dreier cited in this manuscript would be sufficient to test for memory. Given the relative simplicity of the measures being taken (location of spiders), this would seem like a very simple addition that would provide a much stronger and more readily interpreted dataset.

      As previously explained in our detailed responses (public reviews), we believe that the current design effectively addresses the questions at hand. Our approach, using a habituation-dishabituation paradigm, provides robust evidence for recognition memory within the framework of early long-term memory.

      Additionally, we have explained why using the distance to the panel as a measure is not appropriate in this context. Specifically, using such a measure can misrepresent the actual interests of the spiders in each other.

      While we acknowledge the merits of the ant studies by Dreier, our current design allows for a detailed understanding of the spiders' recognition capabilities over short (10 min) and slightly longer intervals (up to one hour). This is sufficient to demonstrate the presence of recognition memory without the necessity of further experiments. The observed patterns of habituation and dishabituation responses in our study clearly indicate that the spiders can distinguish between familiar and novel individuals, which supports our claims.

      Given these points, we respectfully maintain that the current data and experimental design are adequate to support our findings and provide a comprehensive understanding of recognition memory in Phidippus regius.

      (2) The writing is rather impenetrable. The results explain the basic finding in terms of statistical variables rather than simply stating the results. A clear and straightforward statement such as 'the spiders showed reduced interest upon habituation trials, indicating xyz' (and then citing the stats) is preferable to the introduction of results as a statistical model. The statistical model is a means of assessing the results. It is not the result. Describe the data.

      We tried to improve that in the current version.

      (3) Showing more straightforward data such as distance from the joint barrier would make the paper much easier to understand.

      This paper has been on bioRxiv for some time and my guess is that it has ended up here because it is having trouble in review. Collecting new data that more directly test the question at hand, presenting the data in a more direct manner, and more critically evaluating your own claims will improve the paper.

      While it is true that the paper has been on bioRxiv for a while, this submission marks the first instance where it has undergone peer review. Prior to this, the manuscript was submitted to other journals but was not reviewed.

      We hope the explanations provided in the “public reviews” section, along with the revised manuscript, sufficiently clarify our study and its conclusions. We believe the current data robustly address the research questions, and as outlined in our detailed responses, we have critically evaluated our claims and presented the data clearly. Given these clarifications, we do not see the necessity for new experiments as the existing data adequately support our findings. We trust that these revisions and explanations will clarify any misunderstandings.

      I am totally sold that the spiders are paying attention to identity at some level. The key now is to understand what that actually means in terms of recognition (i.e. memory of individuals) not just habituation.

      We appreciate the reviewer’s emphasis on the distinction between habituation and memory-based individual recognition. As detailed in the preceding discussion, we have taken great care to clarify how our paradigm distinguishes simple habituation effects from true memory for individual identity. We trust that the preceding sections make clear how our findings go beyond simple habituation to establish genuine individual recognition.

      Reviewer #2 (Recommendations For The Authors):

      Aside from the comments in the public review, I have some additional comments that the authors may wish to consider.

      Numerous times in the manuscript, the authors mentioned that recognizing individuals requires recognition memory. This seems rather obvious, and I wonder if the authors could instead be more precise about what they mean by 'recognition memory'?

      Recognition memory refers to the cognitive ability to identify a previously encountered stimulus, individual, or event as familiar. It involves both encoding and retrieval processes, allowing an organism to distinguish between novel and familiar stimuli. This form of memory is a fundamental component of cognitive functioning and is supported by neural mechanisms that, in the mammalian brain, involve the hippocampus and other brain regions associated with memory processing.

      In our study, we aimed to test whether Phidippus regius recognizes conspecifics, or, in other words, utilizes recognition memory to distinguish between familiar and unfamiliar conspecifics. With the habituation-dishabituation paradigm, we assessed the spiders' ability to recognize previously encountered individuals and demonstrate memory retention over short (10 min) and extended periods (1 hour).

      Encoding: In the initial trial, when a spider encounters an individual for the first time (Figure 1A, “Baseline” or “Dishabituation” for every following trial), it encodes the visual information related to that specific individual. This encoding process involves creating a memory trace of the individual's phenotypic characteristics.

      Storage: During the visual separation period, this encoded information is stored in the spider's memory system. The memory trace, though initially fragile, starts to stabilize over the separation period. Whether or not this leads to some form of consolidated memory remains unaddressed. This aspect was highlighted by the first reviewer, but our focus is on the early process rather than on late processes, such as consolidation. 

      Retrieval: In the subsequent trial, when the same individual is presented again, the spider retrieves the stored memory trace. If the spider recognizes the individual, its behaviour reflects habituation, indicating memory retrieval. Conversely, when a novel individual is introduced, the lack of stored memory trace triggers a different behavioural response, indicating dishabituation. This differential response demonstrates the spider's ability to distinguish between familiar and unfamiliar individuals. This differential response is also key to understanding the nature of habituation over the three sessions, as introducing novel spiders leads to a significant dishabituation response after the three sessions in Experiment 2.

      In Line 39, the authors state that they used "a naturalistic experimental procedure". I would like to know how this experiment is 'naturalistic'. The authors' use of an arena does not appear naturalistic, or something the spiders would encounter in the wild.

      We appreciate the reviewer's comment regarding our use of the term 'naturalistic'. We acknowledge that the experimental arena itself does not replicate the conditions found in the wild. Our approach aimed to incorporate elements of natural behaviour by allowing two spiders to freely move and interact within the controlled environment. This approach aligns with principles from computational ethology, which seeks to balance the trade-off between repeatability/standardization and observing free, naturalistic behaviour. By using this paradigm, we aimed to capture behaviours that closely resemble those exhibited in their natural habitat. This setup was chosen to balance the need for ecological validity with the requirements for standardized data collection. 

      Also, and this point has been raised above, by observing the spiders' natural interactions without restraining them or using artificial stimuli like computer animations, we aimed to capture behaviours that closely resemble their natural responses to conspecifics. In contrast, we would not have any clear expectations regarding responses to arbitrarily designed artificial stimuli. This method provides a more ecologically valid assessment of the spiders' recognition abilities.

      There are a few details wrong in Line 41. 'Salticidae' is a family name and shouldn't be italicized. Also, the sentence suggests that there is a spider called a 'jumping spider' in the family Salticidae, which is technically called Phidippus regius. To clarify, all spiders in the family Salticidae are known as jumping spiders, and one species of jumping spiders is called Phidippus regius.

      We will correct this in the manuscript to accurately reflect the classification and terminology. Thank you for pointing out these inaccuracies.

      A manuscript on individual recognition by a salticid should include citations to earlier papers that have already considered individual recognition by salticids. As well as the paper by Tedore and Johnsen (2013), the authors should be aware of the following papers.

      Clark, R. J., & Jackson, R. R. (1994). Portia labiata, a cannibalistic jumping spider, discriminates between its own and foreign egg sacs. International Journal of Comparative Psychology, 7, 38-43.

      Clark, R. J., & Jackson, R. R. (1994). Self-recognition in a jumping spider: Portia labiata females discriminate between their own draglines and those of conspecifics. Ethology, Ecology & Evolution, 6, 371-375.

      Clark, R. J., & Jackson, R. R. (1995). Araneophagic jumping spiders discriminate between the draglines of familiar and unfamiliar conspecifics. Ethology, Ecology & Evolution, 7, 185-190.

      We appreciate the reviewer's suggestion to include citations to these earlier papers. We will add the recommended references to provide a comprehensive background.

      In Line 203, I would not consider "interaction with human caretakers and experimenters" to be a form of behavioral enrichment. This kind of interaction has the potential to be stressful for the spiders, rather than enriching. I suggest deleting that part of the sentence.

      We appreciate the reviewer's feedback and agree that interactions with human caretakers and experimenters might not always be enriching and could potentially be stressful for the spiders. We will remove that part of the sentence to better reflect the intended meaning.

      Reviewer #3 (Recommendations For The Authors):

      This manuscript is useful and interesting, and I predict that it will be influential, but more attention should be given to stating the objective and conclusion accurately and clearly. As I understand it, the objective was to investigate a specific hypothesis: that Phidippus regius has a capacity to identify conspecific individuals as particular individuals (i.e., individual identification). Strong evidence supporting this hypothesis being true would be especially remarkable because I am unaware of any published work having shown evidence of a spider expressing this specific perceptual capacity.

      Thank you for recognizing the significance and potential influence of our manuscript. We agree that clearly stating the objective and conclusions is essential for conveying the importance of our findings. Our results provide robust evidence supporting the hypothesis that Phidippus regius can recognize and remember individual conspecifics. We will revise the manuscript to more clearly highlight the objective and our conclusions, emphasizing the novel evidence for individual identification in these spiders.

      Based on reading this manuscript and based on my understanding of the meaning of 'individual identification', it seems to me that the hypothesis that P. regius has a capacity for individual identification might or might not be true, and the experiments in this manuscript cannot tell us which is the case. 

      We respectfully disagree with the reviewer's assessment. Our experiments were carefully designed to test whether P. regius has the capacity for individual identification, and our results provide clear evidence supporting this hypothesis. The systematic differences in the spiders' behaviour when encountering familiar versus novel individuals indicate that they can recognize and remember specific conspecifics. We will revise the manuscript to ensure that the evidence and conclusions are stated more clearly to address any potential misunderstandings.

      Determining which is the case would have required research that made better use of the literature, displayed more critical thinking, addressed credible alternative hypotheses, and adopted experimental methods that focused more strictly on individual identification.

      The question of whether P. regius has a capacity for individual identification is not left ambiguous by our study. Our findings clearly demonstrate this capacity through systematic behavioural responses to familiar versus novel individuals. As pointed out above, the experimental procedure might be complex, but the results are systematic despite this complexity. The experiments were designed to directly address the hypothesis of individual identification, and the data robustly support our conclusions. While considering alternative hypotheses is important, the results we present provide a coherent and compelling case for individual identification in P. regius. We will ensure our manuscript clearly articulates this narrative and the supporting evidence.

      At the same time, I also appreciate that asking for all of that at once would be asking for too much. As I see it, this manuscript tells us about research that moves us closer to a clear focus on the details and questions that will matter in the context of considering a hypothesis that is strictly about individual identification. More importantly, I think this research reveals a perceptual capacity that is remarkable even if it is not strictly a capacity for individual identification.

      We understand the desire for a more focused exploration of individual identification with paradigms more familiar to the reviewers and we acknowledge that further detailed studies could enhance our understanding of this capacity. However, our findings do indeed suggest that Phidippus regius exhibits a remarkable perceptual capacity for recognizing and remembering individual conspecifics. The systematic behavioural responses observed in our experiments strongly indicate that these spiders possess the ability for individual recognition. While our study may not have explored every potential detail (e.g. which features are most crucial for the memory matching processes), the evidence we present robustly supports the conclusion of individual identification.

      We acknowledge that it is indeed valuable to follow established paradigms and build upon the frameworks that have been used successfully in similar species and studies. These paradigms provide a solid foundation for scientific inquiry and allow for comparability across different research efforts. However, it is equally important to acknowledge and explore alternative approaches. Scientific progress is driven not only by replication but also by innovation. By employing new paradigms, researchers can uncover novel insights and push the boundaries of current understanding. The paradigm we used in our study, while different from those traditionally applied to similar research, is not an invention but a well-established method in various domains. It represents an innovative application in the context of our specific research questions, offering a fresh perspective and contributing to the advancement of the field.

      As I understand it, 'individual identification' means identifying another individual as being a particular individual instead of a member of a larger set (or 'class') of individuals. An 'individual' is a set containing a single individual. Interesting examples of identifying members of larger sets include discriminating between familiar and unfamiliar individuals. In the context of the specific experiments in this manuscript, familiar-unfamiliar discrimination means discriminating between recently-seen and not-so-recently-seen individuals. My impression is that the experiments in this manuscript have given us a basis for concluding that P. regius has a capacity for familiar-unfamiliar (recently seen versus not so recently seen) discrimination. If this is the case, then I think this is the conclusion that should be emphasised. This would be an important conclusion.

      I appreciate that, depending on how we use the words, familiar-unfamiliar discrimination might be construed as being 'individual identification'. An individual is identified as 'the individual recently seen'. As a casual way of speaking, it can be reasonable to call this 'individual identification'. The difficulty comes from the way calling this 'individual identification' can suggest something more than has been demonstrated. To navigate through this difficulty, we need an expression to use for a capacity that goes beyond familiar-unfamiliar discrimination. In the context of this manuscript about P. regius, we need expressions that will make it easy to consider two things. One of these things is a capacity for familiar-unfamiliar discrimination. The other is the capacity to identify another individual as being a particular individual.

      We appreciate the reviewer's insightful comments on the distinction between familiar-unfamiliar discrimination and individual identity recognition. Our study indeed focuses on demonstrating that Phidippus regius can recognize and remember individual conspecifics, providing evidence for individual identity recognition.

      Two specific behavioural hallmarks speak against mere familiarity recognition:

      First, the significant dishabituation response to novel individuals introduced after multiple sessions underscores the specificity of the recognition. This shows that the spiders' habituation is not general but specific to familiar individuals. 

      Second, the pattern of habituation over the sessions provides further evidence: We observed the strongest systematic modulation in Session 1, a reduced modulation in Session 2, and a further diminished effect in Session 3. If the spiders were only responding based on familiarity, we would expect a more drastic decrease, resulting in a washed-out non-effect by Session 2. However, the continued, though diminishing, differentiation between habituation and dishabituation trials across sessions indicates that the spiders are not merely responding to a general sense of familiarity but are engaging in individual recognition. In other words, the spiders' ability to distinguish between familiar and novel individuals even after repeated exposures suggests that they are not just recognizing a familiar status but are identifying specific individuals.

      Things people do might help clarify what this means. People have an extraordinary capacity for identifying other individuals as particular individuals. Often this is based on giving each other names. Imagine we are letting somebody see photographs and asking them to identify who they see. The answer might be, 'somebody familiar' or 'somebody I saw recently' (familiar-unfamiliar discrimination); or the question might be answered by naming a particular individual (individual identification).

      We appreciate the reviewer's efforts to clarify the distinction between familiar-unfamiliar discrimination and individual recognition using human examples. However, we believe this comparison might not fully capture the complexity of individual recognition in non-human animals. 

      Familiarity recognition refers to recognizing someone as having been seen or encountered before without necessarily distinguishing them from others in the same category. On the other hand, identity recognition involves recognizing a specific individual based on unique characteristics (or features). In humans, this often involves naming, but more critically, as in most animals, it involves recognizing visual, auditory, chemical or other sensory cues. In animals, including spiders, individual recognition does not involve, let alone rely on, naming; it rests on the ability to distinguish between individuals based on sensory cues and learnt associations. This is a valid and well-documented form of individual recognition across many species.

      Individual recognition does not require naming or the assignment of a referential label. Animals can distinguish between specific individuals based on previously perceived and stored features and characteristics. Naming is the exception rather than the rule in the animal kingdom. Only a few species, such as humans and maybe certain cetaceans, use naming for identity recognition. This is an evolutionary rarity and not the standard mechanism for individual recognition, which primarily relies on sensory cues and learnt associations. Furthermore, the mechanism of recognition in both humans and animals involves a complex process of matching incoming sensory and perceptual information with stored memory representations. Naming is merely a tool for communication, allowing us to convey which individual we are referring to. It is not the mechanism by which recognition occurs. The core of individual recognition is this matching process, where sensory cues (visual, auditory, chemical, etc.) are compared to memory traces of previously encountered individuals. Therefore, the suggestion that individual identification necessitates naming misrepresents the actual cognitive processes involved. 

      We can think of individual identification being based on more fine-grained discrimination (with this, set size = one), with familiar-unfamiliar discrimination being more coarse-grained discrimination (with this, set size can be more than one). Restricting the expression 'individual identification' to instances of having the capacity to identify another individual as being a particular individual (set size = one) is better aligned with normal usage of this expression.

      Absolutely, the distinction between fine-grained and coarse-grained discrimination aligns with the concept of different category levels, such as basic and subordinate levels, put forward by Eleanor Rosch (e.g. Rosch, 1973). In the context of individual recognition, fine-grained discrimination (where set size = one) refers to the ability to identify a specific individual based on unique characteristics. This is referred to as subordinate level categorization. Coarse-grained discrimination (where set size can be more than one) refers to recognizing someone as familiar without distinguishing them from others in the same category, more similar to basic level categorization. 

      Rosch, E.H. (1973). "Natural categories". Cognitive Psychology. 4 (3): 328–50. doi:10.1016/0010-0285(73)90017-0

      There is a strong emphasis on an asocial-social distinction in this manuscript. It seems to me that this needs to be focused more clearly on the specific factors that would make a capacity for individual identification beneficial. In the context of this manuscript, the term 'social' may suggest too much. It seems to me that the issue that matters the most is whether individuals live in situations where important encounters occur frequently between the same individuals. Irrespective of whether other notions of the meaning of 'social' also apply, there are salticids that live in aggregated situations where they frequently have important encounters with each other. This is the case with Phidippus regius in the field in Florida, but I realize that there may not be much published information about the natural history of this salticid. Even so, there are salticids to which the word 'social' has been applied in published literature.

      We appreciate the reviewer's comments on the asocial-social distinction and we agree that this terminology might need refinement. Our intent was not to categorize Phidippus regius rigidly but to explore the contextual factors influencing the benefits of individual identification. The critical factor in our study is indeed the frequency and importance of encounters between individuals, rather than a broader social structure. We will revise the manuscript to reflect this more nuanced perspective, focusing on the ecological validity of our experimental design and the adaptive significance of individual recognition in environments where repeated encounters can occur.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      The authors observed a decline in autophagy and proteasome activity in the context of Milton knockdown. Through proteomic analysis, they identified an increase in the protein levels of eIF2β, subsequently pinpointing a novel interaction within eIF subunits where eIF2β contributes to the reduction of eIF2α phosphorylation levels. Furthermore, they demonstrated that overexpression of eIF2β suppresses autophagy and leads to diminished motor function. It was also shown that in a heterozygous mutant background of eIF2β, Milton knockdown could be rescued. This work represents a novel and significant contribution to the field, revealing for the first time that the loss of mitochondria from axons can lead to impaired autophagy function via eIF2β, potentially influencing the acceleration of aging.

      Thank you so much for your review and comments.

      Reviewer #2 (Public Review):

      In the manuscript, the authors aimed to elucidate the molecular mechanism that explains neurodegeneration caused by the depletion of axonal mitochondria. In Drosophila, starting with siRNA depletion of Milton and Miro, the authors attempted to demonstrate that the depletion of axonal mitochondria induces the defect in autophagy. From proteome analyses, the authors hypothesized that autophagy is impacted by the abundance of eIF2β and the phosphorylation of eIF2α. The authors followed up the proteome analyses by testing the effects of eIF2β overexpression and depletion on autophagy. With the results from those experiments, the authors proposed a novel role of eIF2β in proteostasis that underlies neurodegeneration derived from the depletion of axonal mitochondria.

      The manuscript has several weaknesses. The reader should take extra care while reading this manuscript and when acknowledging the findings and the model in this manuscript.

      The defect in autophagy by the depletion of axonal mitochondria is one of the main claims in the paper. The authors should work more on describing their results of LC3-II/LC3-I ratio, as there are multiple ways to interpret the LC3 blotting for the autophagy assessment. Lysosomal defects result in the accumulation of LC3-II thus the LC3-II/LC3-I ratio gets higher. On the other hand, the defect in the early steps of autophagosome formation could result in a lower LC3-II/LC3-I ratio. From the results of the actual blotting, the LC3-I abundance is the source of the major difference for all conditions (Milton RNAi and eIF2β overexpression and depletion).

      Thank you so much for your review and comments. As the reviewer pointed out, LC3-II/LC3-I ratio changes do not necessarily indicate autophagy defects. However, since p62 also accumulated (Figure 2B, 2E, 3E, Figure 8C, Figure 9C), these results collectively suggest that autophagy is lowered.

      As the reviewer pointed out and as we described in v2, milton knockdown, eIF2β overexpression and heterozygosity increase LC3-I abundance. We do not know how these conditions increase LC3-I at this moment. We will investigate the cause of the increase in LC3-I by milton knockdown and how it contributes to impaired autophagy. We added this discussion as:

      Lines 388-393; ‘Our results also suggest that milton knockdown and overexpression of eIF2β affect autophagy via increased LC3-I abundance (Figures 2 and 7), suggesting an unconventional mechanism of autophagy suppression. To our knowledge, the roles of eIF2β in aging and autophagy independent of ISR have not been reported. Our results revealed a novel function of eIF2β to maintain proteostasis during aging, while further investigation is required to elucidate underlying mechanisms.’

      Another main point of the paper is that the up-regulation of eIF2β by depleting the axonal mitochondria leads to the proteostasis crisis. This claim is formed by the findings from the proteome analyses. The authors should have presented their proteomic data with much more thorough presentation and explanation. As in the experiment scheme shown in Figure 4A, the authors did two proteome analyses: one from the 7-day-old sample and the other from the 21-day-old sample. The manuscript only shows a plot of the result from the 7-day-old sample, but not that of the result from the 21-day-old sample. For the 21-day-old sample, the authors only provided data in the supplemental table, in which the abundance ratio of eIF2β from the 21-day-old sample is 0.753, meaning eIF2β is depleted in the 21-day-old sample. The authors should have explained the impact of the eIF2β depletion in the 21-day-old sample, so the reader could fully understand the authors' interpretation of the role of eIF2β on proteostasis.

      Thank you for pointing it out. Plots of the 21-day-old proteome results were included in the main figure (Figure 4C) in v2. In this revision, we further analyzed age-dependent changes in eIF2β levels by western blotting (Figure 4G). We found that eIF2β levels increased during aging until 49-day-old and then decreased at 63-day-old (Figure 4G in the revised manuscript). At a young age (7-day-old), eIF2β levels were higher in milton knockdown brains than in the control, whereas at 21-day-old they were lower than in the control. These results suggest that milton knockdown accelerates age-dependent changes in eIF2β. We added these results and discussion to the revised manuscript.

      Lines 240-243: ‘We also investigated age-dependent changes in eIF2β by western blotting of control flies at 7-, 21-, 35-, and 49-, and 63-day-old. eIF2β levels increased during aging until 49-day-old (Figure 4G). These results suggest that upregulation of eIF2β in milton knockdown fly brain reflects early an onset of age-dependent increase of eIF2β levels.’

      Lines 363-368: ‘We also found that eIF2β protein levels increase in an age-dependent manner until 49-day-old and reduces after that (Figure 4G). In the brains with neuronal knockdown of milton, eIF2β levels were higher at 7-day-old than those in control and lower at the 21-day-old (Figure 4D and Supplementary table). These results suggest that milton knockdown is likely accelerating age-dependent changes rather than increasing their magnitude.’

      Our new data indicate that eIF2β levels increase during aging in control flies until 49-day-old, then reduce at 63-day-old (included as Figure 4G in the revised manuscript). These age-dependent changes might explain the reduction in eIF2β levels in Milton knockdown compared to the control in middle age: higher eIF2β levels in milton knockdown flies at a young age than control and lower eIF2β levels in the middle-aged flies may reflect premature aging.

      We included these sentences in the discussion section:

      Lines 240-243: ‘We also investigated age-dependent changes in eIF2β by western blotting of control flies at 7-, 21-, 35-, and 49-, and 63-day-old. eIF2β levels increased during aging until 49-day-old (Figure 4G). These results suggest that upregulation of eIF2β in milton knockdown fly brain reflects early an onset of age-dependent increase of eIF2β levels.’

      Lines 359-371: ‘Our results suggest that the loss of axonal mitochondria is an event upstream of proteostasis collapse during aging. The number of puncta of ubiquitinated proteins was higher in milton knockdown at 14-day-old, but there was no significant difference at 30-day-old (Figure 1). Proteome analyses also showed that age-related pathways, such as immune responses, are enhanced in young flies with milton knockdown (Table 2). We also found that eIF2β protein levels increase in an age-dependent manner until 49-day-old and reduces after that (Figure 4G). In the brains with neuronal knockdown of milton, eIF2β levels were higher at 7-day-old than those in control and lower at the 21-day-old (Figure 4D and Supplementary table). These results suggest that milton knockdown is likely accelerating age-dependent changes rather than increasing their magnitude. Disruption of proteostasis is expected to contribute neurodegeneration38 , and it would be interesting to analyze the sequence of protein accumulation and axonal degeneration in milton knockdown (24,29 and Figure 1) in detail with higher time resolution.’


      With our new data, we revised some of our responses to the first round of reviewer’s comments.

      Reviewer #1 (Public Review):

      The authors observed a decline in autophagy and proteasome activity in the context of Milton knockdown. Through proteomic analysis, they identified an increase in the protein levels of eIF2β, subsequently pinpointing a novel interaction within eIF subunits where eIF2β contributes to the reduction of eIF2α phosphorylation levels. Furthermore, they demonstrated that overexpression of eIF2β suppresses autophagy and leads to diminished motor function. It was also shown that in a heterozygous mutant background of eIF2β, Milton knockdown could be rescued. This work represents a novel and significant contribution to the field, revealing for the first time that the loss of mitochondria from axons can lead to impaired autophagy function via eIF2β, potentially influencing the acceleration of aging. To further support the authors' claims, several improvements are necessary, particularly in the methods of quantification and the points that should be demonstrated quantitatively. It is crucial to investigate the correlation between aging and the proteins eIF2β and eIF2α.

      Thank you so much for your review and comments. We included analyses of protein levels of eIF2α, eIF2β, and eIF2γ at 7 days and 21 days (Figure 4D). The manuscript was revised as below:

      Lines 246-249 ‘As for the other subunits of eIF2 complex, proteome analysis did not detect a significant difference in the protein levels of eIF2α and eIF2γ between milton knockdown and control flies at 7 and 21 days (Figure 4D).’

      NEW TEXT: We analyzed age-dependent changes in eIF2β levels in more detail by western blotting (Figure 4G). We found that eIF2β levels increased during aging until 49-day-old and then decreased at 63-day-old (Figure 4G in the revised manuscript). At a young age (7-day-old), eIF2β levels were higher in milton knockdown brains than in the control, whereas at 21-day-old they were lower than in the control. These results suggest that milton knockdown accelerates age-dependent changes in eIF2β. We added these results and discussion to the revised manuscript.

      NEW TEXT: Lines 240-243: ‘We also investigated age-dependent changes in eIF2β by western blotting of control flies at 7-, 21-, 35-, and 49-, and 63-day-old. eIF2β levels increased during aging until 49-day-old (Figure 4G). These results suggest that upregulation of eIF2β in milton knockdown fly brain reflects early an onset of age-dependent increase of eIF2β levels.’

      NEW TEXT: Lines 363-368: ‘We also found that eIF2β protein levels increase in an age-dependent manner until 49-day-old and reduces after that (Figure 4G). In the brains with neuronal knockdown of milton, eIF2β levels were higher at 7-day-old than those in control and lower at the 21-day-old (Figure 4D and Supplementary table). These results suggest that milton knockdown is likely accelerating age-dependent changes rather than increasing their magnitude.’

      Reviewer #2 (Public Review):

      In the manuscript, the authors aimed to elucidate the molecular mechanism that explains neurodegeneration caused by the depletion of axonal mitochondria. In Drosophila, starting with siRNA depletion of Milton and Miro, the authors attempted to demonstrate that the depletion of axonal mitochondria induces the defect in autophagy. From proteome analyses, the authors hypothesized that autophagy is impacted by the abundance of eIF2β and the phosphorylation of eIF2α. The authors followed up the proteome analyses by testing the effects of eIF2β overexpression and depletion on autophagy. With the results from those experiments, the authors proposed a novel role of eIF2β in proteostasis that underlies neurodegeneration derived from the depletion of axonal mitochondria.

      The manuscript has several weaknesses. The reader should take extra care while reading this manuscript and when acknowledging the findings and the model in this manuscript.

      The defect in autophagy by the depletion of axonal mitochondria is one of the main claims in the paper. The authors should work more on describing their results of LC3-II/LC3-I ratio, as there are multiple ways to interpret the LC3 blotting for the autophagy assessment. Lysosomal defects result in the accumulation of LC3-II thus the LC3-II/LC3-I ratio gets higher. On the other hand, the defect in the early steps of autophagosome formation could result in a lower LC3-II/LC3-I ratio. From the results of the actual blotting, the LC3-I abundance is the source of the major difference for all conditions (Milton RNAi and eIF2β overexpression and depletion). In the text, the authors simply state the observation of their LC3 blotting. The manuscript lacks an explanation of how to evaluate the LC3-II/LC3-I ratio. Also, the manuscript lacks an elaboration on what the results of the LC3 blotting indicate about the state of autophagy by the depletion of axonal mitochondria.

      Thank you for pointing it out, and we apologize for an insufficient description of the result. We included quantitation of the levels of LC3-I and LC3-II in Figures 2A, 2D, 3D, 7B (Figure 6B in the previous version), and 8B (Figure 7B in the previous version). As the reviewer pointed out, LC3-II/LC3-I ratio changes do not necessarily indicate autophagy defects. However, since p62 also accumulated (Figure 2B, 2E, 3E, 7C (Figure 6C in the previous version), 8C (Figure 7C in the previous version)), these results collectively suggest that autophagy is lowered. We revised the manuscript to include this discussion as below:

      Lines 174-186 ‘During autophagy progression, LC3 is conjugated with phosphatidylethanolamine to form LC3-II, which localizes to isolation membranes and autophagosomes. LC3-I accumulation occurs when autophagosome formation is impaired, and LC3-II accumulation is associated with lysosomal defects31,32. p62 is an autophagy substrate, and its accumulation suggests autophagic defects31,32. We found that milton knockdown increased LC3-I, and the LC3-II/LC3-I ratio was lower in milton knockdown flies than in control flies at 14-day-old (Figure 2A). We also analyzed p62 levels in head lysates sequentially extracted using detergents with different stringencies (1% Triton X-100 and 2% SDS). Western blotting revealed that p62 levels were increased in the brains of 14-day-old of milton knockdown flies (Figure 2B). The increase in the p62 level was significant in the Triton X-100-soluble fraction but not in the SDS-soluble fraction (Figure 2B), suggesting that depletion of axonal mitochondria impairs the degradation of less-aggregated proteins.’

      Line 189-190: ‘At 30 day-old, LC3-I was still higher, and the LC3-II/LC3-I ratio was lower, in milton knockdown compared to the control (Figure 2D).’

      Line 202-203: ‘However, in contrast with milton knockdown, Pfk knockdown did not affect the levels of LC3-I, LC3-II or the LC3-II/LC3-I ratio (Figure 3D).’

      Line 279-285: ‘Neuronal overexpression of eIF2β increased LC3-II, while the LC3-II/LC3-I ratio was not significantly different (Figure 7A and B). Overexpression of eIF2β significantly increased the p62 level in the Triton X-100-soluble fraction (Figure 7C, 4-fold vs. control, p < 0.005 (1% Triton X-100)) but not in the SDS-soluble fraction (Figure 7C, 2-fold vs. control, p = 0.062 (2% SDS)), as observed in brains of milton knockdown flies (Figure 2B). These data suggest that neuronal overexpression of eIF2β accumulates autophagic substrates.’

      Line 311-319: ‘Neuronal knockdown of milton causes accumulation of autophagic substrate p62 in the Triton X-100-soluble fraction (Figure 2B), and we tested if lowering eIF2β ameliorates it. We found that eIF2β heterozygosity caused a mild increase in LC3-I levels and decreases in LC3-II levels, resulting in a significantly lower LC3-II/LC3-I ratio in milton knockdown flies (Figure 8B). eIF2β heterozygosity decreased the p62 level in the Triton X-100-soluble fraction in the brains of milton knockdown flies (Figure 8C). The p62 level in the SDS-soluble fraction, which is not sensitive to milton knockdown (Figure 2B), was not affected (Figure 8C). These results suggest that suppression of eIF2β ameliorates the impairment of autophagy caused by milton knockdown.’

      Another main point of the paper is that the up-regulation of eIF2β by depleting the axonal mitochondria leads to the proteostasis crisis. This claim is formed by the findings from the proteome analyses. The authors should have presented their proteomic data with much more thorough presentation and explanation. As in the experiment scheme shown in Figure 4A, the authors did two proteome analyses: one from the 7-day-old sample and the other from the 21-day-old sample. The manuscript only shows a plot of the result from the 7-day-old sample, but not that of the result from the 21-day-old sample. For the 21-day-old sample, the authors only provided data in the supplemental table, in which the abundance ratio of eIF2β from the 21-day-old sample is 0.753, meaning eIF2β is depleted in the 21-day-old sample. The authors should have explained the impact of the eIF2β depletion in the 21-day-old sample, so the reader could fully understand the authors' interpretation of the role of eIF2β on proteostasis.

      NEW TEXT: Thank you for pointing it out. We included plots of the 21-day-old proteome results as a part of the main figure (Figure 4C). As the reviewer pointed out, eIF2β protein levels are lower in milton knockdown background at the 21-day-old compared to the control. Since a reduction in the eIF2β ameliorated milton knockdown-induced locomotor defects in aged flies (Figure 7D), the reduction in eIF2β observed in the 21-day-old milton knockdown flies is not likely to negatively contribute to milton knockdown-induced defects. Our new data indicate that eIF2β levels increase during aging in control flies until 49-day-old, then reduce at 63-day-old (included as Figure 4G in the revised manuscript). These age-dependent changes might explain the reduction in eIF2β levels in Milton knockdown compared to the control in middle age: higher eIF2β levels in milton knockdown flies at a young age than control and lower eIF2β levels in the middle-aged flies may reflect premature aging.

      NEW TEXT: We included these sentences in the discussion section:

      NEW TEXT: Lines 240-243: ‘We also investigated age-dependent changes in eIF2β by western blotting of control flies at 7-, 21-, 35-, and 49-, and 63-day-old. eIF2β levels increased during aging until 49-day-old (Figure 4G). These results suggest that upregulation of eIF2β in milton knockdown fly brain reflects early an onset of age-dependent increase of eIF2β levels.’

      NEW TEXT: Lines 359-371: ‘Our results suggest that the loss of axonal mitochondria is an event upstream of proteostasis collapse during aging. The number of puncta of ubiquitinated proteins was higher in milton knockdown at 14-day-old, but there was no significant difference at 30-day-old (Figure 1). Proteome analyses also showed that age-related pathways, such as immune responses, are enhanced in young flies with milton knockdown (Table 2). We also found that eIF2β protein levels increase in an age-dependent manner until 49-day-old and reduces after that (Figure 4G). In the brains with neuronal knockdown of milton, eIF2β levels were higher at 7-day-old than those in control and lower at the 21-day-old (Figure 4D and Supplementary table). These results suggest that milton knockdown is likely accelerating age-dependent changes rather than increasing their magnitude. Disruption of proteostasis is expected to contribute neurodegeneration38 , and it would be interesting to analyze the sequence of protein accumulation and axonal degeneration in milton knockdown (24,29 and Figure 1) in detail with higher time resolution.’

      The manuscript consists of several weaknesses in its data and explanation regarding translation.

      (1) The authors are likely misunderstanding the effect of phosphorylation of eIF2α on translation. The P-eIF2α is inhibitory for translation initiation. However, the authors seem to be mistaken that the down-regulation of P-eIF2α inhibits translation.

      We are sorry for our insufficient explanation in the previous version. As the reviewer pointed out, it is well known that the phosphorylated form of eIF2α inhibits translation initiation. Neuronal knockdown of milton caused a reduction in p-eIF2α (Figure 5D and E (Figure 4J and K in the previous version)), and it also lowered translation (Figure 6 (Figure 5 in the previous version)); the relationship between these two events is currently unclear. We do not think that the reduction in p-eIF2α suppressed translation; rather, we propose that an imbalance in the expression levels of the components of the eIF2 complex negatively affects translation. We revised the discussion section to describe our interpretation in more detail as below:

      Line 374-384: ‘eIF2β is a component of eIF2, which meditates translational regulation and ISR initiation. When ISR is activated, phosphorylated eIF2α suppresses global translation and induces translation of ATF4, which mediates transcription of autophagy-related genes39,40. Since ISR can positively regulate autophagy, we suspected that suppression of ISR underlies a reduction in autophagic protein degradation. We found neuronal knockdown of milton reduced phosphorylated eIF2α, suggesting that ISR is reduced (Figure 5). However, we also found that global translation was reduced (Figure 6). Increased levels of eIF2β might disrupt the eIF2 complex or alter its functions. The stoichiometric mismatch caused by an imbalance of eIF2 components may inhibit ISR induction. Supporting this model, we found that eIF2β upregulation reduced the levels of p-eIF2α (Figure 7).’

      We have revised the graphical abstract and removed the eIF2 complex since its role in the loss of proteostasis caused by milton knockdown has not been elucidated yet.

      (2) The result of polysome profiling in Figure 4H is implausible. By 10%-25% sucrose density gradient, polysomes are not expected to be observed. The authors should have used a gradient with much denser sucrose, such as 10-50%.

      Thank you for pointing it out. The stated 10-25% was a mistake; the gradient was 10-50%, and we apologize for the oversight. It was corrected (Figure 6 (Figure 5 in the previous version)).

      (3) Also on the polysome profiling, as in the method section, the authors seemed to fractionate ultra-centrifuged samples from top to bottom and then measured A260 by a plate reader. In that case, the authors should have provided a line plot with individual data points, not the smoothly connected ones in the manuscript.

      Thank you for pointing it out. We revised the graph (Figure 6 (Figure 5 in the previous version)).

      (4) For both the results from polysome profiling and puromycin incorporation (Figure 4H and I), the difference between control siRNA and Milton siRNA are subtle, if not nonexistent. This might arise from the lack of spatial resolution in their experiment as the authors used head lysate for these data but the ratio of Phospho-eIF2α/eIF2α only changes in the axons, based on their results in Figure 4E-G. The authors could have attempted to capture the spatial resolution for the axonal translation to see the difference between control siRNA and Milton siRNA.

      Thank you for your comment. We agree that it would be an interesting experiment, but it will take a considerable amount of time to analyze axonal translation with spatial resolution. We will try to include such analyses in the future. For this manuscript, we revised the discussion section to include the reviewer's suggestion as below:

      Lines 355-357: ‘Further analyses to dissect the effects of milton knockdown on proteostasis and translation in the cell body and axon by experiments with spatial resolution would be needed.’

      Recommendations for the authors:

      From the Reviewing Editor:

      As the Reviewing Editor, I have read your manuscript and the associated peer reviews. I have concerns about publishing this work in its current form. I think that your manuscript cannot claim to have found a novel function of eIF2beta because of technical uncertainties and conceptual problems that should be addressed.

      Thank you so much for your review and comments. We addressed all the concerns raised by the reviewers. Point-by-point responses are listed below.

      First, your manuscript is based partly on what appears to be a mistaken understanding of the mechanistic basis of the ISR. Specifically, eIF2 is a heterotrimeric complex of alpha, beta, and gamma subunits. When eIF2a is phosphorylated, the heterotrimer adopts a new conformation. This conformation directly binds and inhibits eIF2B, the decameric GEF that exchanges the GDP bound to the gamma subunit of the eIF2 complex for GTP. Unless I misunderstood your paper, you seem to propose that decreasing levels of phospho-eIF2a will inhibit translation, but this is backward from what we know about the ISR.

      Thank you for your insightful comment, and we are sorry for the confusion. We did not mean to propose that decreasing levels of phospho-eIF2α inhibit translation. We apologize for our insufficient explanation, which might have caused a misunderstanding (Lines 312-318 in the original version). We agree with the reviewer that ‘mismatch due to elevated eIF2-beta could change the behavior of the ISR’. We revised the text in the Results section as follows:

      Lines 263-268 (in the Result section) ‘Phosphorylation of eIF2α induces conformational changes in the eIF2 complex and inhibits global translation36. To analyze the effects of milton knockdown on translation, we performed polysome gradient centrifugation to examine the level of ribosome binding to mRNA. Since p-eIF2α was downregulated, we hypothesized that milton knockdown would enhance translation. However, unexpectedly, we found that milton knockdown significantly reduced the level of mRNAs associated with polysomes (Figure 6A and B).’

      Lines 374-384 (in the Discussion section): ‘eIF2β is a component of eIF2, which meditates translational regulation and ISR initiation. When ISR is activated, phosphorylated eIF2α suppresses global translation and induces translation of ATF4, which mediates transcription of autophagy-related genes39,40. Since ISR can positively regulate autophagy, we suspected that suppression of ISR underlies a reduction in autophagic protein degradation. We found neuronal knockdown of milton reduced phosphorylated eIF2α, suggesting that ISR is reduced (Figure 5). However, we also found that global translation was reduced (Figure 6). Increased levels of eIF2β might disrupt the eIF2 complex or alter its functions. The stoichiometric mismatch caused by an imbalance of eIF2 components may inhibit ISR induction. Supporting this model, we found that eIF2β upregulation reduced the levels of p-eIF2α (Figure 7).’

      It may be possible that a stoichiometric mismatch due to elevated eIF2-beta could change the behavior of the ISR, but your paper doesn't adequately address the expression levels of all three eIF2 subunits: alpha, beta, and gamma. The proteomic data shown in Fig 4B is unconvincing on its own because the changes in the beta subunit are subtle. The Western blot in Figure 4C suggests that the KD changes the mass or mobility of the beta subunit, and most importantly, there are no Western blots measuring the levels of eIF2a, eIF2a-phospho, or eIF2-gamma.

      We appreciate the reviewer’s comment and agree that the stoichiometric mismatch due to elevated eIF2β may interfere with ISR. We found that overexpression of eIF2β lowered p-eIF2α (Figure S2 in V1), which supports this model. We included this data in the main figure in the revised manuscript (Figure 7D) and revised the text as below:

      Lines 286-289: ‘Since milton knockdown reduced the p-eIF2α level (Figure 5E), we asked whether an increase in eIF2β affects p-eIF2α. Neuronal overexpression of eIF2β did not affect the eIF2α level but significantly decreased the p-eIF2α level (Figure 7D and E).’

      Expression data for eIF2α and eIF2γ were extracted from the proteome analyses and included as a table (Figure 4D). Western blots of phospho-eIF2α (Figure S1 in V1) are now included in the main figure (Figure 5B). The Results section was revised as below:

      Lines 246-249: ‘As for the other subunits of eIF2 complex, proteome analysis did not detect a significant difference in the protein levels of eIF2α and eIF2γ between milton knockdown and control flies at 7 and 21 days (Figure 4D).’

      NEW TEXT: We also analyzed age-dependent changes of eIF2β by western blotting and found that eIF2β increased during aging until 49-day-old. We included this result as Figure 4G and added these sentences in the result section:

      NEW TEXT: Line 240-243: ‘We also investigated age-dependent changes in eIF2β by western blotting of control flies at 7-, 21-, 35-, and 49-, and 63-day-old. eIF2β levels increased during aging until 49-day-old (Figure 4G). These results suggest that upregulation of eIF2β in milton knockdown fly brain reflects early an onset of age-dependent increase of eIF2β levels.’

      Reviewer #1 (Recommendations For The Authors):

      L125-128: In this section, while the efficiency of Milton knockdown is referenced from a previous publication, it is necessary to also mention that the Miro knockdown has been similarly reported in the literature. Additionally, the Methods section lacks details on the Miro RNAi line used, and Table 2 does not include the genotype for Miro RNAi. This information should be included for clarity and completeness.

      Thank you for pointing it out. Knockdown efficiency with this strain has been reported (Iijima-Ando et al., PLoS Genet, 2012). We revised the text to include citation and knockdown efficiency as follows:

      Lines 136-147: ‘There was no significant increase in ubiquitinated proteins in milton knockdown flies at 1-day old, suggesting that the accumulation of ubiquitinated proteins caused by milton knockdown is age-dependent (Figure S1). We also analyzed the effect of the neuronal knockdown of Miro, a partner of milton, on the accumulation of ubiquitin-positive proteins. Since severe knockdown of Miro in neurons causes lethality, we used UAS-Miro RNAi strain with low knockdown efficiency, whose expression driven by elav-GAL4 caused 30% reduction of Miro mRNA in head extract24. Although there was a tendency for increased ubiquitin-positive puncta in Miro knockdown brains, the difference was not significant (Figure 1B, p>0.05 between control RNAi and Miro RNAi). These data suggest that the depletion of axonal mitochondria induced by milton knockdown leads to the accumulation of ubiquitinated proteins before neurodegeneration occurs.’

      L132-L136: The current phrasing in this section suggests an increase in ubiquitinated proteins for both Milton and Miro knockdowns. However, since there is no significant difference noted for Miro, it is incorrect to state an increase in ubiquitin-positive puncta. Furthermore, combining the results of Milton knockdown to claim an increase in ubiquitinated proteins prior to neurodegeneration is misleading. At the very least, the expression here needs to be moderated to accurately reflect the findings.

      Thank you for pointing it out. We revised the text as above.

      L137-L141: Results in Figure 1 indicate that Milton knockdown leads to an increase in ubiquitinated proteins at 14 days, while Miro knockdown shows no difference from the control at either 14 or 30 days. Conversely, both the control and Miro exhibit an increase in ubiquitinated proteins with aging, but this trend does not seem to apply to Milton knockdown. This observation suggests that Milton KD may not affect the changes in protein quality control associated with aging. It implies that Milton's function might be more related to protein homeostasis in younger cells, or that changes due to aging might overshadow the effects of Milton knockdown. These interpretations should be included in the Results or Discussion sections for a more comprehensive analysis.

      NEW TEXT: Thank you for your insightful comment. As you mentioned, the accumulation of ubiquitinated proteins significantly increases only in young flies. Age-related pathways, such as immune responses, are highlighted in young milton knockdown flies but not in the aged flies. Our new result indicates that eIF2β increases during aging in control flies (included as Figure 4G in the revised manuscript), and upregulation of eIF2β in milton knockdown is only observed at a young age. These results suggest that milton knockdown does not increase the magnitude of age-dependent changes but accelerates their onset. We revised the text to include those points as follows:

      NEW TEXT: Lines 152-153: ‘These results suggest that depletion of axonal mitochondria may have more impact on proteostasis in young neurons than in old neurons.’

      NEW TEXT: Lines 359-371: ‘Our results suggest that the loss of axonal mitochondria is an event upstream of proteostasis collapse during aging. The number of puncta of ubiquitinated proteins was higher in milton knockdown at 14-day-old, but there was no significant difference at 30-day-old (Figure 1). Proteome analyses also showed that age-related pathways, such as immune responses, are enhanced in young flies with milton knockdown (Table 2). We also found that eIF2β protein levels increase in an age-dependent manner until 49-day-old and reduces after that (Figure 4G). In the brains with neuronal knockdown of milton, eIF2β levels were higher at 7-day-old than those in control and lower at the 21-day-old (Figure 4 and Supplementary table). These results suggest that milton knockdown is likely accelerating age-dependent changes rather than increasing their magnitude. Disruption of proteostasis is expected to contribute neurodegeneration38, and it would be interesting to analyze the sequence of protein accumulation and axonal degeneration in milton knockdown (24,29 and Figure 1) in detail with higher time resolution.’

      L143 : Please remove the erroneously included quotation mark.

      Thank you for pointing it out. We corrected it.

      L145-L147:

      While it is understood that Milton knockdown results in a reduction of mitochondria in axons, as reported previously and seemingly indicated in Figure 1E, this paper repeatedly refers to axonal depletion of mitochondria. Therefore, it would be beneficial to quantitatively assess the number of mitochondria in the axonal terminals located in the lamina via electron microscopy. Such quantification would robustly reinforce the argument that mitochondrial absence in axons is a consequence of Milton knockdown.

      Thank you for pointing it out. We included quantitation of the number of mitochondria in the synaptic terminals (Figure 1E).

      The text and figure legend were revised accordingly:

      Lines 156-157: ‘As previously reported24, the number of mitochondria in presynaptic terminals decreased in milton knockdown (Figure 1E).’

      The knockdown of Milton is known to reduce mitochondrial transport from an early stage, but what about swelling? By observing swelling at 1 day and 14 days, it may be possible to confirm the onset of swelling and discuss its correlation with the accumulation of ubiquitinated proteins.

      Quantitation of axonal swelling has also been included (Figure 1F).

      We appreciate the reviewer's comments on the correlation between the accumulation of ubiquitinated proteins and axonal swelling. Axonal swelling was not observed at 3-days-old (Iijima-Ando et al., PLoS Genetics, 2012), indicating that axonal swelling is an age-dependent event. Dense materials are found in swollen axons more often than in normal axons, suggesting a positive correlation between disruption of proteostasis and axonal damage. It would be interesting to analyze the time course of events further; however, we feel it is beyond the scope of this manuscript. We revised the text to include this discussion as:

      Lines 157-160: ‘The swelling of presynaptic terminals, characterized by the enlargement and roundness, was not reported at 3-day-old24 but observed at this age with about 4% of total presynaptic terminals (Figure 1F, asterisks).’

      Lines 162-167: ‘Dense materials are rarely found in age-matched control neurons, indicating that milton knockdown induces abnormal protein accumulation in the presynaptic terminals (Figure 1G and H). In milton knockdown neurons, dense materials are found in swollen presynaptic terminals more often than in presynaptic terminals without swelling, suggesting a positive correlation between the disruption of proteostasis and axonal damage (Figure 1G).’

      Lines 369-371: ‘Disruption of proteostasis is expected to contribute neurodegeneration38, and it would be interesting to analyze the sequence of protein accumulation and axonal degeneration in milton knockdown (24,29 and Figure 1) in detail with higher time resolution.’

      L147-L151: Though Figures 1F and 1G provide qualitative representations, it is advisable to quantitatively assess whether dense materials significantly accumulate. Such quantitative analysis would be required to verify the accumulation of dense materials in the context of the study.

      Thank you for pointing it out. We included quantitation of the number of neurons with dense material (Figure 1G). We revised the manuscript as follows:

      Line 162-164: ‘Dense materials are rarely found in age-matched control neurons, indicating that milton knockdown induces abnormal protein accumulation in the presynaptic terminals (Figure 1G and H).’

      Regarding Figure 1B, C:

      Even though the count of puncta in the whole brain appears to be fewer than 400, the magnification of the optic lobe suggests a substantial presence of puncta. Please clarify in the Methods section what constitutes a puncta and whether the quantification in the whole brain is based on a 2D or 3D analysis. Detail the methodology used for quantification.

      Thank you for your comment. We revised the method section to include more details as below:

      Lines 440-443: ‘Quantitative analysis was performed using ImageJ (National Institutes of Health) with maximum projection images derived from Z-stack images acquired with same settings. Puncta was identified with mean intensity and area using ImageJ.’
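The quantification described in this Methods excerpt (maximum projection of a Z-stack, then puncta selected by mean intensity and area) could look roughly like the sketch below. This is not the authors' ImageJ workflow: the function names, the pixel threshold used for segmentation, and the mean-intensity and area cutoffs are all hypothetical and chosen only for illustration.

```python
from collections import deque


def max_projection(stack):
    """Collapse a Z-stack (list of equally sized 2D slices) by per-pixel maximum."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[max(s[r][c] for s in stack) for c in range(cols)] for r in range(rows)]


def count_puncta(image, pixel_threshold, min_mean, min_area, max_area):
    """Count 4-connected blobs that pass mean-intensity and area filters.

    Assumption: pixels >= pixel_threshold are foreground; a blob counts as a
    punctum if its mean intensity >= min_mean and min_area <= area <= max_area.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < pixel_threshold:
                continue
            # Flood-fill one connected component, tracking area and summed intensity.
            area, total = 0, 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                total += image[y][x]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= pixel_threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if min_area <= area <= max_area and total / area >= min_mean:
                count += 1
    return count


# Toy two-slice stack: one 2-pixel blob in slice 0, one 1-pixel blob in slice 1.
stack = [
    [[0, 0, 0, 0], [0, 9, 9, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 8]],
]
projection = max_projection(stack)
```

Because the count is taken on a 2D projection, puncta that overlap along the Z axis merge into one blob, which is one reason a whole-brain count can look lower than a magnified view suggests.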

      What about 1-day-old specimens? Does Milton knockdown already show an increase in ubiquitinated protein accumulation at this early stage? Investigating whether ubiquitin-protein accumulation is involved in aging promotion or is already prevalent during developmental stages is a necessary experiment.

      Thank you for your comment. We carried out immunostaining with an anti-ubiquitin antibody in the brains at 1-day-old. No significant difference was detected between the control and milton knockdown. This result has been included as Figure S1 in the revised manuscript. The result section was revised as below:

      Line 136-139 ‘There was no significant increase in ubiquitinated proteins in milton knockdown flies at 1-day old, suggesting that the accumulation of ubiquitinated proteins caused by milton knockdown is age-dependent (Figure S1).’

      For Figure 1E: In the Electron Microscopy section of the Methods, define how swollen axons were identified and describe the quantification methodology used.

      Thank you for your comment. Swollen axons are, unlike normal axons, round in shape and enlarged. We revised the text as below:

      Lines 157-160: ‘The swelling of presynaptic terminals, characterized by the enlargement and roundness, was not reported at 3-day-old24 but observed at this age with about 4% of total presynaptic terminals (Figure 1F, asterisks).’

      Lines 689-691, Figure 1 legend: ‘Swollen presynaptic terminals (asterisks in (F)), characterized by the enlargement and higher circularity, were found more frequently in milton knockdown neurons.’

      L218-L219: Throughout the text, the expression 'eIF2β is "upregulated" in response to Milton knockdown' is frequently used. However, considering the presented results, it might be more accurate to interpret that under the condition of Milton knockdown, eIF2β is not undergoing degradation but rather remains stable.

      Thank you for pointing it out. We replaced ‘upregulated’ with ‘increased’ throughout the text.

      L234-L235: On what basis is the conclusion drawn that there is a reduction? Given that three experiments have been conducted, it would be possible and more convincing to quantify the results to determine if there is a significant decrease.

      Thank you for pointing it out. We quantified the AUC of polysome fraction and carried out a statistical analysis. There is a significant decrease in polysome in milton knockdown, and this result has been included in Figure 5B. We revised the figure and the legend accordingly.
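As a rough illustration of what quantifying the area under the curve (AUC) of the polysome fraction can involve, the sketch below integrates an absorbance trace with the trapezoidal rule and expresses the polysome region as a share of the whole profile. The function names, the normalization to the total profile, and the `polysome_start` boundary are assumptions made for this example; the response does not state how the authors defined the region or normalized the values.

```python
def trapezoid_auc(x, y, x_start, x_end):
    """Trapezoidal area under y(x), using only samples with x in [x_start, x_end]."""
    pts = [(xi, yi) for xi, yi in zip(x, y) if x_start <= xi <= x_end]
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def polysome_share(x, y, polysome_start):
    """Polysome AUC divided by the total AUC (hypothetical normalization)."""
    total = trapezoid_auc(x, y, x[0], x[-1])
    polysomes = trapezoid_auc(x, y, polysome_start, x[-1])
    return polysomes / total


# Toy profile: flat absorbance, polysome region assumed to start at position 2.
position = [0, 1, 2, 3, 4]
absorbance = [2, 2, 2, 2, 2]
share = polysome_share(position, absorbance, polysome_start=2)
```

Computing one such value per biological replicate gives the per-sample numbers that a statistical comparison between control and knockdown would then use.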

      L236: 5H-> 4H

      Thank you for pointing it out, and we are sorry for the confusion. We corrected it.

      L238-L239: Since there is no significant difference observed, it may not be accurate to interpret a reduction in puromycin incorporation.

      Thank you for pointing it out. As described above, quantification of polysome fractions showed that milton knockdown significantly reduced polysome (Figure 6B (Figure 5B in the previous version)). We revised the manuscript as below:

      Lines 267-268: ‘However, unexpectedly, we found that milton knockdown significantly reduced the level of mRNAs associated with polysomes (Figure 6A and B).’

      Figure 5D and Figure 6D: Climbing assays have been conducted, but I believe experiments should also be performed to examine whether overexpression or heterozygous mutants of eIF2β induce or suppress degeneration.

      Thank you for pointing it out. We analyzed the eyes with eIF2β overexpression for neurodegeneration. Although there was a tendency of elevated neurodegeneration in the retina with eIF2β overexpression, the difference between control and eIF2β overexpression did not reach statistical significance (Figure S2). This result has been included as Figure S2 in the revised manuscript, and the following sentences have been included in the text:

      Lines 292-297: ‘We asked if eIF2β overexpression causes neurodegeneration, as depletion of axonal mitochondria in the photoreceptor neurons causes axon degeneration in an age-dependent manner24. eIF2β overexpression in photoreceptor neurons tends to increase neurodegeneration in aged flies, while it was not statistically significant (p>0.05, Figure S2).’

      L271-L272: The results in Figure 6B are surprising. I anticipated a greater increase compared to the Milton knockdown alone. While p62 appears to be reduced, it is not clear why these results lead to the conclusion that lowering eIF2β rescues autophagic impairment. Please add a discussion section to address this point.

      Thank you for pointing it out. We apologize for the unclear description of the result. Milton knockdown flies show p62 accumulation (Figure 2), and deleting one copy of eIF2beta in milton knockdown background reduced p62 accumulation (Figure 8C (Figure 7C in the previous version)). We revised the text as below:

      Lines 311-319: ‘Neuronal knockdown of milton causes accumulation of autophagic substrate p62 in the Triton X-100-soluble fraction (Figure 2B), and we tested if lowering eIF2β ameliorates it. We found that eIF2β heterozygosity caused a mild increase in LC3-I levels and decreases in LC3-II levels, resulting in a significantly lower LC3-II/LC3-I ratio in milton knockdown flies (Figure 8B). eIF2β heterozygosity decreased the p62 level in the Triton X-100-soluble fraction in the brains of milton knockdown flies (Figure 8C). The p62 level in the SDS-soluble fraction, which is not sensitive to milton knockdown (Figure 2B), was not affected (Figure 8C). These results suggest that suppression of eIF2β ameliorates the impairment of autophagy caused by milton knockdown.’

      L369: Please specify the source of the anti-ubiquitin antibody used.

      Thank you for pointing it out. We included the antibody information in the method section.

      Figure 7: While the relationship between Milton knockdown and the eIF2β and eIF2α proteins has been elucidated through the authors' efforts, I would like to see an investigation into whether eIF2β is upregulated and eIF2α phosphorylation is reduced in simply aged Drosophila. This would help us understand the correlation between aging and eIF2 protein dynamics.

      Thank you for your comment. We agree that it is an important question, and we are working on it. However, we feel that it is beyond the scope of the current manuscript.

      L645-L646: If the mushroom body is identified using mito-GFP, then include mito-GFP in the genotype listed in Supplementary Table 2.

      We are sorry for the oversight. We corrected it in Supplementary Table 2.

      Additionally, while it is presumed that the mito-GFP signal decreases in axons with Milton RNAi, how was the lobe tips area accurately selected for analysis? Please include these details along with a comprehensive description of the quantification methodology in the Methods section.

      Thank you for your comment. Although the mito-GFP signal in the axon is weak in the milton knockdown neurons, it is sufficient to distinguish the mushroom body structure from the background. We revised the Methods section to include this information:

      Line 443-447: ‘For eIF2α and p-eIF2α immunostaining, the mushroom body was detected by mitoGFP expression.’

    1. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #2 (Recommendations for the authors):

      Line 364-370: This paragraph is not very clear to me.

      Thank you for pointing this out. We agree that our point could have been made clearer, and we have clarified it as follows:

      “The geographic positions of species’ ranges determine the local pressures and environmental factors to which they are exposed (MacLean and Beissinger, 2017; Pacifici et al., 2020), potentially masking or confounding the effects of traits that evolved under conditions determined by range geography (Schuetz et al., 2019). This process could cause trait-related trends to differ across levels of biological organization (Srivastava et al., 2021), from local populations (where traits might be critical) to biogeographical extents (where traits might be unrelated to range or phenological shifts; Grewe et al., 2013; Gutiérrez and Wilson, 2021; Sunday et al., 2015; Zografou et al., 2021).” (Lines 370-377).

      Reviewer #3 (Recommendations for the authors):

      L313: '...higher population growth' compared to what? Does this mean that species shifting to earlier emergence really show higher population growth over time?

      Thank you for this suggestion. We have clarified as follows: “Earlier seasonal timing allows species to stay within their climatic limits and maintain population growth rates (Macgregor et al., 2019), although earlier emergence could expose individuals to early season weather extremes (McCauley et al., 2018).” (Lines 316-319).

      L336: Same here. Please refer to your comparative counterpart in such statements. Does 'plasticity may enable higher population growth' mean higher than for species shifting range or phenology or higher compared to the previous level for a given species. In many cases it seems you are referring to an overall baseline, so that the 'higher' means 'lesser decline'. Wouldn't plasticity maintain population growth at similar levels as before? The current wording suggests that plasticity results in species exceeding their previous population growth. Please rephrase.

      We agree it was confusing with no comparative counterpart, so we changed the sentence as follows: “Adaptive evolution and plasticity may enable high population growth rates in newly-colonized areas (Angert et al., 2020; Usui et al., 2023), but this possibility can only be directly tested with long term population trend data.” (Lines 341-343).

      L307: The term 'universal winners' appears too strong and not well justified given the lack of the crucial third dimension of response. In fact, changes in phenology are less indicative than abundance trends. Combined with range shifts they would tell a story of success or failing, while phenological shifts would rather help to understand how species adapted. I am not saying the insight cannot stand alone, but it is important to adapt the wording in this regard.

      Thank you for this comment. We have clarified the text as follows: “These results suggest that some species may have an advantage with respect to climate change: they demonstrate the flexibility to respond both temporally and spatially to the onset of rapid climate change.” (Lines 310-313).

      We also softened language around winners and losers on line 388: “It remains unclear if range and phenology shifts relate to trends in abundance, but our results suggest that there may be ‘winners’ and ‘losers’ under climate change (Figure 4).” (Lines 387-388).

      L326-240: I agree with line 330 that abundance trends are needed to clarify the situation of species shifting or not shifting ranges and phenology. However, this abstract should clarify that this is particularly important to understand whether non shifting species are really the 'losers'. If these species show adapted evolution or plasticity, we would expect they do not decline in abundance. Even without shifts in range or phenology they would be the 'ultimate winners' as you call it.

      Thank you for this comment. We agree that abundance trends are necessary to understand potential winners and losers. We have made this addition to the abstract as follows: “Species shifting in both space and time may be more resilient to extreme conditions, although further work integrating abundance data is needed.” (Lines 16-18).

    1. Reviewer #4 (Public review):

      Summary:

      The authors establish a behavioral paradigm for avoidance of H2S and conduct a large candidate screen to identify genetic requirements. They follow up by genetically dissecting a large number of implicated pathways - insulin, TGF-beta, oxygen/HIF-1, and mitochondrial ROS, which have varied effects on H2S avoidance. They additionally assay whole-animal gene expression changes induced by varying concentrations and durations of H2S exposure.

      Strengths:

      The implicated pathways are tested extensively through mutants of multiple pathway molecules. The authors address previous reviewer concerns by directly testing the ability of ASJ to respond to H2S via calcium imaging. This allows the authors to revise their previous conclusion and determine that ASJ does not directly respond to H2S and likely does not initiate the behavioral response.

      Weaknesses:

      Despite the authors' focus on acute perception of H2S, I don't think the experiments tell us much about perception. I think they indicate pathways that modulate the behavior when disrupted, especially because most manipulations used broadly affect physiology on long timescales. For instance, genetic manipulation of ASJ signaling, oxygen sensing, HIF-1 signaling, mitochondrial function, as well as starvation are all expected to constitutively alter animal physiology, which could indirectly modulate responses to H2S. The authors rule out effects on general locomotion in some cases, but other physiological changes could relatively specifically modulate the H2S response without being involved in its perception.

      I am actually not convinced that H2S is directly perceived by the C. elegans nervous system at all. As far as I can tell, the avoidance behavior could be a response to H2S-induced tissue damage rather than the gas itself.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This paper sets out to achieve a deeper understanding of the effects of hydrogen sulfide on C. elegans behavior and physiology, with a focus on behavior, detection mechanism(s), physiological responses, and detoxification mechanisms.

      Strengths: 

      The paper takes full advantage of the experimental tractability of C. elegans, with thorough, well-designed genetic analyses.

      Some evidence suggests that H2S may be directly detected by the ASJ sensory neurons. The paper provides interesting and convincing evidence for complex interactions between responses to different gaseous stimuli, particularly an antagonistic role between H2S and O2 detection/response. Intriguing roles for mitochondria and iron homeostasis are identified, opening the door to future studies to better understand the roles of these components and processes.

      We thank the reviewer for the supportive comments.

      Weaknesses: 

      The claim that worms' behavioral responses to H2S are mediated by direct detection is incompletely supported. While a role for the chemosensory neuron ASJ is implicated, it remains unclear whether this reflects direct detection. Other possibilities, including indirect effects of ASJ and the guanylyl cyclase daf-11 on O2 responses, are also consistent with the authors' data.

      We thank the reviewer for the insightful comment and agree that the role of ASJ neurons in H2S detection was not clear. We included new experiments and revised our text to make it clearer.

      Since our initial analyses suggest a role of ASJ neurons in H2S-evoked locomotory responses (Figure 2F and G), we thought that this would offer us a starting point to dissect the neuronal circuit involved in H2S responses. Expression of the tetanus toxin catalytic domain in ASJ, which blocks neurosecretion, inhibited H2S-evoked locomotory speed responses (Figure 2H), suggesting that neurosecretion from ASJ promotes the H2S-evoked response (Lines 162–165). We then performed calcium imaging of ASJ neurons in response to H2S exposure. However, while we observed CO2-evoked calcium transients in ASJ using GCaMP6s, we did not detect any calcium response to H2S under several conditions, including animals on food, off food, and with different H2S concentrations and exposure times (Figure 2—Figure supplement 2E and F) (Lines 166–170). Since signaling from ASJ neurons regulates developmental programs that modify sensory functions in C. elegans (Murakami et al., 2001), the involvement of ASJ neurons is not specific to H2S and ASJ neurons are unlikely to serve as the primary H2S sensor (discussed in Lines 449–458). Therefore, the exact sensory neuron, circuit and molecular triggers mediating acute H2S avoidance remain to be elucidated.

      Our subsequent investigation of mitochondrial components suggests that a burst of mitochondrial ROS production may be the trigger for H2S avoidance, as transient exposure to rotenone substantially increases baseline locomotory speed (Figure 7E) (Lines 391–396). However, to initiate avoidance behavior to H2S, mitochondrial ROS could potentially target multiple neurons and cellular machineries, making it challenging to pinpoint specific sites of action. Nevertheless, we agree that further dissection of the neural circuits and mitochondrial signaling in H2S avoidance will be important and should be explored in future studies.

      The role of H2S-mediated damage in behavioral responses, particularly when detoxification pathways are disrupted, remains unclear.

      We thank the reviewer for the insightful comment and fully agree with the concern raised. The same issue was also noted by the other reviewers. We agree that decreased locomotory responses in H2S-sensitized animals can arise from distinct causes, either systemic toxicity or behavioral adaptation, and distinguishing between these is critical. We have included new experiments and revised the text to clarify this issue.

      Our data suggest that increased initial omega turns and a rapid loss of locomotion in hif-1 and detoxification-defective mutants including sqrd-1 and ethe-1 likely reflect an enhanced sensitivity to H2S toxicity due to their failure to induce appropriate adaptive responses (Figure 5D–F, Figure 5J–L, Figure 5—Figure supplement 1F–P). Supporting this, hif-1 mutants become less responsive to unrelated stimuli (near-UV light) after 30 minutes of H2S exposure (Figure 5I).

      In contrast, egl-9 and SOD-deficient animals show reduced initial omega-turn and reduced speed responses (Figure 5B, Figure 7G, Figure 5—Figure supplement 1A and B, and Figure 7—Figure supplement 1F and G), although both egl-9 and sod mutants respond normally to the other stimuli prior to or after H2S exposure (Figure 5I, Figure 5—Figure supplement 1C, and Figure 7—Figure supplement 1H). Since disrupting egl-9 stabilizes HIF-1 and upregulates the expression of numerous genes involved in cellular defense against H2S toxicity, the enhanced detoxification capacity in egl-9 mutants likely increases animals’ tolerance to H2S, thereby reducing avoidance to otherwise toxic H2S levels. Similarly, persistently high ROS in SOD-deficient animals activates a variety of stress-responsive signaling pathways, including HIF-1, NRF2/SKN-1 and DAF-16/FOXO signaling (Lennicke & Cocheme, 2021; Patten et al., 2010), facilitating cellular adaptation to redox stress and reducing animals’ responsiveness to toxic H2S levels. Taken together, these findings support the view that reduced locomotory speed during H2S exposure can arise from distinct mechanisms: early systemic toxicity in hif-1 and detoxification-defective mutants, versus enhanced cellular adaptation in egl-9 and SOD mutants. We have integrated the relevant information across the Results section and discussed this in Lines 485-536.

      The findings of the paper are somewhat disjointed, such that a clear picture of the relationships between H2S detection, detoxification mechanisms, mitochondria, and iron does not emerge from these studies. Most importantly, the relative roles of H2S detection and integration, vs. general and acute mitochondrial crisis, in generating behavioral responses are not convincingly resolved.

      We thank the reviewer for this comment and agree that our presentation did not fully connect different findings into a cohesive picture. To address this, we have acquired new data, and revised the abstract, results and discussion sections to clarify two phases of H2S-evoked responses: an initial avoidance behavior upon H2S exposure, followed by a later phase of adaptation and detoxification when the escape is not successful.

      In brief, we began with the basic characterization of H2S-induced locomotory speed response, followed by a candidate gene screen to identify key molecules and pathways involved in initial speed response to H2S. Subsequently, we focused on three major intersecting pathways that contributed to the acute behavioral response to H2S. These include cGMP signaling, which led to the identification of ASJ neurons; nutrient-sensitive pathways that modulate behavioral responses to both H2S and CO2; and O2-sensing signaling, whose activation inhibits responses to H2S. However, the molecules and neurons in these pathways, including ASJ, likely play modulatory roles and are unlikely to serve as the primary H2S sensors. Our subsequent analysis, however, suggests that mitochondria play a critical role in triggering avoidance behavior upon H2S exposure. Brief treatment with rotenone, a potent inducer of ROS, led to a marked increase in locomotory speed (Figure 7E). This suggests the possibility that a burst of ROS production triggered by toxic levels of H2S (Jia et al., 2020) may initiate the avoidance behavior.

      When the initial avoidance fails, H2S detoxification programs are induced as a long-term survival strategy. The induction of detoxification programs appears to enhance tolerance to H2S exposure and contributes to the gradual decrease of locomotory speed in H2S. We now provide a clearer picture of how different pathways modulate H2S detoxification and adaptation (see our responses to other comments). Briefly, mutants defective in detoxification, such as hif-1 and other detoxification-defective mutants, showed stronger initial omega-turn response and a rapid loss of locomotion. This loss of locomotion is likely caused by early cellular toxicity as the mutants failed to respond to other unrelated stimuli (near-UV light) after 30 minutes of H2S exposure (Figure 5I). Likewise, smf-3 mutants and BP-treated animals were hypersensitive to H2S (Figure 6D and E, and Figure 6—Figure supplement 1G and I), likely due to impaired H2S detoxification under low iron conditions, as iron is a co-factor required for the activity of the H2S detoxification enzyme ETHE-1 (Figure 5K and Figure 5—Figure supplement 1E).

      In contrast, reduced locomotion and response in other contexts such as egl-9 mutants and SOD-deficient animals reflect H2S-induced adaptive mechanisms rather than toxicity, as they remain responsive to the other stimuli after H2S exposure. Since disrupting egl-9 stabilizes HIF-1 and upregulates the expression of numerous genes involved in cellular defense against H2S toxicity, the enhanced detoxification capacity in egl-9 mutants likely increases animals’ tolerance to H2S, thereby reducing avoidance to otherwise toxic H2S levels. Similarly, persistently high ROS in SOD-deficient animals activates a variety of stress-responsive signaling pathways, including HIF-1, NRF2/SKN-1 and DAF-16/FOXO signaling (Lennicke & Cocheme, 2021; Patten et al., 2010), facilitating cellular adaptation to redox stress and reducing animals’ responsiveness to toxic H2S levels. Therefore, locomotory speed declines during H2S exposure through distinct mechanisms in different animals. We have integrated the relevant information across the Results section and discussed this in Lines 485-536.

      Reviewer #2 (Public Review): 

      Summary: 

      H2S is a gas that is toxic to many animals and causes avoidance in animals such as C. elegans. The authors show that H2S increases the frequency of turning and the speed of locomotion. The response was shown to be modulated by a number of neurons and signaling pathways as well as by ambient oxygen concentrations. The long-term adaptation involved gene expression changes that may be related to iron homeostasis as well as the homeostasis of mitochondria.

      Strengths: 

      Overall, the authors provide many pieces that will be important for solving how H<SUB>2</SUB>S signals through neuronal circuits to change gene expression and physiological programs. The experiments rely mostly on a behavioral assay that measures the increase of locomotion speed upon exposure to H<SUB>2</SUB>S. This assay is then combined with manipulations of environmental factors, different wild-type strains, and mutants. The mutants analyzed were obtained as candidates from the literature and from transcriptional profiling that the authors carried out in worms that were exposed to H<SUB>2</SUB>S. These studies imply several genetic signaling pathways, some neurons, and metabolism-related factors in the response to H<SUB>2</SUB>S. Hence the data provided should be useful for the field.  

      We thank the reviewer for the supportive comments.

      Weaknesses: 

      On the other hand, many important aspects of the underlying mechanisms remain unsolved and the reader is left with many loose ends. For example, it is not clear how H<SUB>2</SUB>S is actually sensed, how sensory neurons are activated and signal to downstream circuits, and what the role of ciliated and RMG neurons is in this circuit. It remains unclear how signals lead to gene expression and physiological changes such as metabolic rewiring. Solving all this would clearly be beyond the scope of a single manuscript. Yet, the manuscript also does not focus on understanding one of these central aspects and rather is all over the place, which makes it harder to understand for readers who are not in this core field. Multiple additional methods and approaches exist to dig deeper into these mechanisms in the future, such as neuronal calcium imaging, optogenetics, and metabolic analysis. To generate a story that will be interesting to a broad readership, substantial additional experimentation would be required. Further, in the current manuscript, it is often difficult to understand the rationales of the experiments, why they were carried out, and how to place them into a context. This could be improved in terms of documentation, narration/explanation, and visualization.

      We thank the reviewer for the comment, which has also been raised by the other reviewers. We agree that our initial submission was poorly presented. We also acknowledge that some aspects, such as the detailed neural circuit and sensory transduction, remain unresolved. We have now included additional experiments and revised the manuscript to clarify the logic of our experiments, provide better context for our findings, and improve both the narrative flow and data visualization to make the manuscript more accessible to readers. We now provide a clearer picture of how different pathways interact to modulate the initial avoidance response, and the H<SUB>2</SUB>S detoxification and behavioral habituation during prolonged H<SUB>2</SUB>S exposure. The following response is similar to the one for reviewer #1.

      In brief, we began with the basic characterization of the H<SUB>2</SUB>S-induced locomotory speed response, followed by a candidate gene screen to identify key molecules and pathways involved in the initial speed response to H<SUB>2</SUB>S. Subsequently, we focused on three major intersecting pathways that contributed to the acute behavioral response to H<SUB>2</SUB>S. These include cGMP signaling, which led to the identification of ASJ neurons; nutrient-sensitive pathways that modulate behavioral responses to both H<SUB>2</SUB>S and CO2; and O2-sensing signaling, whose activation inhibits responses to H<SUB>2</SUB>S. However, the molecules and neurons in these pathways, including ASJ, likely play modulatory roles and are unlikely to serve as the primary H<SUB>2</SUB>S sensors. Our subsequent analysis suggests that mitochondria play a critical role in triggering avoidance behavior upon H<SUB>2</SUB>S exposure. Brief treatment with rotenone, a potent inducer of ROS, led to a marked increase in locomotory speed (Figure 7E). This suggests the possibility that a burst of ROS production triggered by toxic levels of H<SUB>2</SUB>S (Jia et al., 2020) may initiate the avoidance behavior.

      When the initial avoidance fails, H<SUB>2</SUB>S detoxification programs are induced as a long-term survival strategy. The induction of detoxification programs appears to enhance tolerance to H<SUB>2</SUB>S exposure and contributes to the gradual decrease of locomotory speed in H<SUB>2</SUB>S. We now provide a clearer picture of how different pathways modulate H<SUB>2</SUB>S detoxification and adaptation (see our responses to other comments). Briefly, hif-1 and other detoxification-defective mutants showed a stronger initial omega-turn response and a rapid loss of locomotion. This loss of locomotion is likely caused by early cellular toxicity, as the mutants failed to respond to unrelated stimuli (near-UV light) after 30 minutes of H<SUB>2</SUB>S exposure (Figure 5I). Likewise, smf-3 mutants and BP-treated animals were hypersensitive to H<SUB>2</SUB>S (Figure 6D and E, and Figure 6—figure supplement 1G and I), likely due to impaired H<SUB>2</SUB>S detoxification under low iron conditions, as iron is a co-factor required for the activity of the H<SUB>2</SUB>S detoxification enzyme ETHE-1 (Figure 5K and Figure 5—figure supplement 1E).

      In contrast, reduced locomotion and response in other contexts, such as egl-9 mutants and SOD-deficient animals, reflect an H<SUB>2</SUB>S-induced adaptive mechanism rather than toxicity, as these animals remain responsive to other stimuli after H<SUB>2</SUB>S exposure. Since disrupting egl-9 stabilizes HIF-1 and upregulates the expression of numerous genes involved in cellular defense against H<SUB>2</SUB>S toxicity, the enhanced detoxification capacity in egl-9 mutants likely increases animals’ tolerance to H<SUB>2</SUB>S, thereby reducing avoidance of otherwise toxic H<SUB>2</SUB>S levels. Similarly, persistently high ROS in SOD-deficient animals activates a variety of stress-responsive signaling pathways, including HIF-1, NRF2/SKN-1 and DAF-16/FOXO signaling (Lennicke & Cocheme, 2021; Patten et al., 2010), facilitating cellular adaptation to redox stress and reducing animals’ responsiveness to toxic H<SUB>2</SUB>S levels. Therefore, different animals reduce their locomotory speed in response to H<SUB>2</SUB>S through distinct mechanisms. We have integrated the relevant information across the Results section and discussed this in Lines 485–536.

      Reviewer #3 (Public Review): 

      Summary: 

      The manuscript explores the behavioral responses of C. elegans to hydrogen sulfide, which is known to exert remarkable effects on animal physiology in a range of contexts. The possibility of genetic and precise neuronal dissection of responses to H<SUB>2</SUB>S motivates the study of responses in C. elegans. The manuscript is well-written in communicating the complex physiology around C. elegans behavioral responses to H<SUB>2</SUB>S and in appropriately citing prior and related relevant work. 

      There are three parts to the manuscript.

      In the first, an immediate behavioral response (increased locomotory rate) upon exposure to H<SUB>2</SUB>S is characterized. The experimental conditions are critical, and data are obtained from exposure of animals to 150 ppm H<SUB>2</SUB>S at 7% O2. The authors provide evidence that this is a chemosensory response to H<SUB>2</SUB>S, showing a requirement for genes encoding components of the cilia apparatus and implicating a role for tax-4 and daf-11. Neuron-specific rescue in the ASJ neurons suggests the ASJ neurons contribute to the response to H<SUB>2</SUB>S. One caveat is that previous work has shown that the dauer-constitutive phenotype of daf-11 mutants can be suppressed by ASJ ablation, suggesting that there may be pervasive changes in animal nervous system signaling that are ASJ-dependent in daf-11 mutants, which may indirectly alter chemosensory responses to H<SUB>2</SUB>S. More direct methods to assess whether ASJ senses H<SUB>2</SUB>S, e.g. using calcium imaging, would better assess a direct role for the ASJ neurons in a behavioral response to H<SUB>2</SUB>S. The authors also point out interesting parallels between the response to H<SUB>2</SUB>S and CO2, though they provide some genetic data separating the two responses. Importantly, the authors note that when aerotaxis (O2 sensing and movement) in the presence of bacterial food is intact, as in npr-1 215F animals, the response to H<SUB>2</SUB>S is abrogated. Mutation in gcy-35 in the npr-1 215F background restores the H<SUB>2</SUB>S chemosensory response.

      There is a second part of the paper that conducts transcriptional profiling of the response to H<SUB>2</SUB>S that corroborates and extends prior work in this area. 

      The final part of the paper is the most intriguing, but for me, also the most problematic. The authors examine how H<SUB>2</SUB>S-evoked locomotory behavioral responses are affected in mutants defective in the stress and detoxification response to H<SUB>2</SUB>S, most notably hif-1. Prior genetic studies have established the pathways leading to HIF-1 activation/stabilization, as well as potential downstream mechanisms. The authors conduct logical genetic analysis to complement studies of the hif-1 mutant and in part motivated by their transcriptional profiling studies, examine the role of iron sequestration/free iron in the locomotory response to H<SUB>2</SUB>S, and further speculate on how the behavior of mutants defective in mitochondrial function might be affected by exposure to H<SUB>2</SUB>S. 

      In some regard, this part of the manuscript is interesting because the analysis begins to connect how the behavior of an animal to a toxic compound is affected by mutations that affect sensitivity to the toxic compound. However, what is unclear is what is being studied at this point. Given that H<SUB>2</SUB>S at 150 ppm is known to be lethal, its addition to mutants clearly sensitized to its effects would be anticipated to have pervasive effects on animal physiology and nervous system function. The authors note that the continued increased locomotion of wild-type animals upon H<SUB>2</SUB>S exposure might be due to the byproducts of detoxification or the detrimental effects of H<SUB>2</SUB>S. The latter explanation seems much more likely, in which case what one may be observing is the effects of general animal sickness, or even a bit more specifically, neuronal dysfunction in the presence of a toxic compound, on locomotion. As such, what is unclear is what conclusions can be taken away from this part of the work.

      Strengths: 

      (1) Characterization of a motor behavior response to H<SUB>2</SUB>S 

      (2) Transcriptional profiling of the response to H<SUB>2</SUB>S corroborating prior work.  

      We thank the reviewer for the supportive comments.

      Weaknesses: 

      Unclear significance and experimental challenges regarding the study of locomotory responses to animals sensitized to the toxic effects of H<SUB>2</SUB>S under exposure to H<SUB>2</SUB>S. 

      We thank the reviewer for the comment, which has also been raised by the other reviewers. We agree that our initial submission left several important questions open, and we acknowledge the fact that some aspects, such as detailed neural circuit and sensory transduction, still remain unresolved. Nevertheless, we acquired new data and revised our text, aiming to clarify the distinct mechanisms underlying the reduced locomotion in different mutants during prolonged H<SUB>2</SUB>S exposure.

      Our data suggest that increased initial omega turns and a rapid loss of locomotion in hif-1 and detoxification-defective mutants, including sqrd-1 and ethe-1, likely reflect an enhanced sensitivity to H<SUB>2</SUB>S toxicity due to their failure to induce appropriate adaptive responses (Figure 5D–F, Figure 5J–L, Figure 5—figure supplement 1F–P). Supporting this, hif-1 mutants become less responsive to unrelated stimuli (near-UV light) after 30 minutes of H<SUB>2</SUB>S exposure (Figure 5I).

      In contrast, egl-9 and SOD-deficient animals show reduced initial reorientation and reduced speed responses (Figure 5B, Figure 7G, Figure 5—figure supplement 1A and B, and Figure 7—figure supplement 1F and G), although both egl-9 and sod mutants respond normally to other stimuli before or after H<SUB>2</SUB>S exposure (Figure 5I, Figure 5—figure supplement 1C, and Figure 7—figure supplement 1H). Since disrupting egl-9 stabilizes HIF-1 and upregulates the expression of numerous genes involved in cellular defense against H<SUB>2</SUB>S toxicity, the enhanced detoxification capacity in egl-9 mutants likely increases animals’ tolerance to H<SUB>2</SUB>S, thereby reducing avoidance of otherwise toxic H<SUB>2</SUB>S levels. Similarly, persistently high ROS in SOD-deficient animals activates a variety of stress-responsive signaling pathways, including HIF-1, NRF2/SKN-1 and DAF-16/FOXO signaling (Lennicke & Cocheme, 2021; Patten et al., 2010), facilitating cellular adaptation to redox stress and reducing animals’ responsiveness to toxic H<SUB>2</SUB>S levels. Taken together, these findings support the view that reduced locomotory speed during H<SUB>2</SUB>S exposure can arise from distinct mechanisms: early systemic toxicity in hif-1 and detoxification-defective mutants, versus enhanced cellular adaptation in egl-9 and SOD mutants. We have integrated the relevant information across the Results section and discussed this in Lines 485–536.

      Reviewer #1 (Recommendations For The Authors): 

      To better substantiate a role for H<SUB>2</SUB>S detection, it would be useful for the authors to image Ca responses to H<SUB>2</SUB>S in ASJ in WT and unc-13, and to rule out the possibility that the requirement for daf-11 in ASJ reflects a role in O2 rather than H<SUB>2</SUB>S detection. 

      We thank the reviewer for this comment. As suggested, we performed calcium imaging of ASJ neurons using GCaMP6s. As previously described, 3% CO<SUB>2</SUB> evoked a calcium transient in ASJ (Figure 2—figure supplement 2F). To investigate whether H<SUB>2</SUB>S evoked a calcium transient in ASJ neurons, we tested several conditions, including animals on food or off food, different H<SUB>2</SUB>S concentrations (~75 or ~150 ppm), and different exposure times (4 or 8 min). However, we did not detect a calcium response to H<SUB>2</SUB>S in ASJ under any of the conditions tested (Figure 2—figure supplement 2E) (Lines 166–168). Given that neuron-specific rescue of daf-11 or tax-4 mutants pointed to a role of ASJ neurons in promoting H<SUB>2</SUB>S responses, we sought to determine how ASJ neurons were involved. Expression of the tetanus toxin catalytic domain in ASJ neurons, which blocks neurosecretion, inhibited H<SUB>2</SUB>S-evoked locomotory speed responses (Figure 2H), similar to the phenotypes observed in daf-11 and daf-7 mutants (Figure 2C and D) (Lines 162–165). These results confirm that ASJ activity and neurosecretion contribute to the H<SUB>2</SUB>S responses, although ASJ is unlikely to serve as the primary H<SUB>2</SUB>S sensor. One potential explanation is that DAF-7 released by ASJ controls the starvation program, which in turn modulates the animal’s response to H<SUB>2</SUB>S. We also discussed this in Lines 449–458.

      The paper would be significantly strengthened by testing the possibility (as the authors acknowledge in lines 348-52) that disruption of detoxification mechanisms reduces sustained behavioral responses to H<SUB>2</SUB>S because of physiological damage. Authors use acute exposure to high O2 for this purpose earlier in the paper, but not to probe the consequences of loss of hif-1 and detoxification factors.  

      We thank the reviewer for the valuable suggestion. As the reviewer highlighted, we attributed the brief locomotory speed responses to H<SUB>2</SUB>S observed in hif-1 mutants to the lack of a detoxification response, leading to the rapid intoxication of the animals. Several lines of evidence support this conclusion. First, we observed that hif-1 and the detoxification mutants displayed a stronger initial reorientation response (omega turns) and a more rapid decline in speed and reversals compared to wild type (Figure 5D–F). Second, to test if hif-1 mutants were indeed more susceptible to H<SUB>2</SUB>S toxicity, we exposed WT and hif-1 animals to H<SUB>2</SUB>S for 30 min and subsequently tested their ability to respond to near-UV light. Unlike in WT animals, the speed response to near-UV light was inhibited in hif-1 mutants (Figure 5I), suggesting that exposure to H<SUB>2</SUB>S for 30 min causes stronger toxicity in animals deficient in HIF-1 signaling. Third, hif-1 and detoxification mutants displayed a sustained high speed in response to 1% O<SUB>2</SUB>, suggesting a specific impairment of the H<SUB>2</SUB>S response. The data were presented in Lines 318–347 and were further discussed in Lines 485–508.

      To better understand whether mitochondrial damage has a role in H<SUB>2</SUB>S-evoked behavior, it might be useful for the authors to determine whether general ROS response pathways are important for H<SUB>2</SUB>S behavioral responses.

      We thank the reviewer for this insightful comment. As suggested, we investigated whether ROS detoxification pathways contribute to H<SUB>2</SUB>S-evoked locomotory speed responses by analyzing mutants in the superoxide dismutase (SOD) family. These experiments, together with other observations, suggest that mitochondrial ROS play a dual role in H<SUB>2</SUB>S-evoked locomotion. The relevant results were presented in Lines 401–425, and were further discussed in Lines 509–536.

      First, we found that increased mitochondrial ROS formation, either induced pharmacologically by rotenone or genetically in mitochondrial electron transport chain (ETC) mutants (Ishii et al., 2013; Ochi et al., 2016; Ramsay & Singer, 1992; Yang & Hekimi, 2010; Zorov, Juhaszova, & Sollott, 2014), suppressed the behavioral response to toxic H<SUB>2</SUB>S (Figure 7A–E). This indicates that mitochondrial ROS plays a significant role in H<SUB>2</SUB>S-evoked responses. One likely explanation is that high ROS formation may dampen the H<SUB>2</SUB>S-triggered ROS spike, or may impair other H<SUB>2</SUB>S signaling processes required to initiate avoidance. Second, consistent with previous reports (Onukwufor et al., 2022), we observed that short-term rotenone exposure (<1 hour) significantly increased baseline locomotory speed. Given that toxic H<SUB>2</SUB>S levels promote ROS formation (Jia et al., 2020), our findings suggest that acute mitochondrial ROS production by toxic levels of H<SUB>2</SUB>S exposure may serve as a trigger for the avoidance response.

      In contrast, animals with sustained mitochondrial ROS production do not have an increased baseline locomotory speed. This effect was observed after 2 hours of rotenone exposure, in mitochondrial ETC mutants, and in animals lacking all SOD enzymes (Figure 7A–K). A likely explanation for the reduced basal locomotory speed during sustained mitochondrial ROS production is the activation of ROS-responsive signaling pathways including HIF-1, NRF2/SKN-1, and DAF-16/FOXO (Lennicke & Cocheme, 2021; Patten, Germain, Kelly, & Slack, 2010), which may promote adaptation to prolonged oxidative stress (Figure 7H). Notably, unlike hif-1 mutants, SOD-deficient animals remained as responsive as WT to other stimuli after 30 minutes of H<SUB>2</SUB>S exposure (Figure 7—figure supplement 1H), indicating that elevated ROS levels do not compromise overall viability or the ability to detoxify H<SUB>2</SUB>S.

      Taken together, these results support a model in which mitochondrial ROS exerts a biphasic effect on H<SUB>2</SUB>S-induced avoidance. It enhances detection and avoidance under acute stress but contributes to locomotory suppression when ROS levels remain elevated chronically.

      Reviewer #2 (Recommendations For The Authors):

      The way the manuscript is presented could be improved without much effort by rewriting/editing. For the reader, it is hard at present to understand the rationales of the experiments, why they were carried out, and how to place them into a context. This could be improved on three levels:

      (1) Documentation 

      (2) Narration/Explanation 

      (3) Visualization 

      (1) Documentation

      Not all of the results in the text are well documented. The results should be described with more details in the written text and improved documentation and quantification of the results. Example: 

      Turning behavior is mentioned as an important aspect of the response to H<SUB>2</SUB>S. There is no citation given and this effect is not well documented. The authors image the animals and could provide video footage of the effect, could quantify, e.g., turning/pirouettes, and provide the data. At the moment the manuscript largely relies on measuring the increase in speed, but the reader is left wondering what other behavioral effects occur and how this is altered in all of the mutant and other conditions tested. Just quantifying speed reduces the readout and seems like an oversimplification to characterize the behavioral response.

      We are grateful for this comment. We now provide video footage of the H<SUB>2</SUB>S effects (Figure 1—Video 1). As suggested, we analyzed the recordings to extract reorientation (omega-turns) and reversals. These analyses are now included in Supplementary File 1, with representative panels displayed in Figure 5 and supplements to Figures 2, 3, 5, 6 and 7. Even though the mutant effects on omega-turns were often subtle, and reversal responses showed considerable variability (likely due to differences in population density, food availability, or animals’ physiological state prior to the assay), this analysis has proven valuable for distinguishing mutants that exhibit adaptation from those that display hypersensitivity to H<SUB>2</SUB>S toxicity. For instance, although both SOD-deficient and BP-treated animals failed to increase their locomotory speed in H<SUB>2</SUB>S (Figure 6E and Figure 7G), they exhibited distinct omega-turn responses (Figure 6—figure supplement 1I and Figure 7—figure supplement 1F), suggesting that different mechanisms likely underlie the locomotory defects of these two groups of animals. We have integrated the omega-turn and reversal data into the text and discussed them in the relevant contexts.

      (2) Narration/Description.

      Generally, the description of the results part is very brief and it is often not clear why a certain experiment was carried out and how. Surely it is possible to check the methods but this interrupts the flow of reading and it would be easier for the reader to be guided through the results with more information what the initial motivation for an experiment is, what the general experimental outline is, and what specific experiments are carried out. 

      We apologize for the lack of clarity and logical structure in the initial submission. In the revised manuscript, we have thoroughly revised the text to improve its organization and readability.

      Examples: 

      Line 97ff: The authors performed a candidate screen yet it is not described why which genes were chosen. Are there also pathways that were tested that turned out to not be involved? 

      We thank the reviewer for the suggestion. To address this, we have added a new section explaining the rationale for selecting genes and pathways in our candidate screen. Briefly, we focused on genes known or predicted to be involved in sensory responses to gaseous stimuli in C. elegans and mammals, including globins and guanylate cyclases (21% O<SUB>2</SUB> sensing), potassium channels (acute hypoxia), and nutrient-sensitive pathways (CO<SUB>2</SUB> responses). We also included mutants defective in sensory signal transduction and neurotransmission. In addition, mitochondrial mutants were analyzed because mitochondria play a central role in H<SUB>2</SUB>S detoxification. The pathways that contributed to the acute H<SUB>2</SUB>S response included cGMP, insulin, and TGF-β signaling, as well as mitochondrial components. In contrast, globins, potassium channels, and biogenic amine signaling did not appear to play significant roles under our assay conditions. The results of the candidate screen are described in Lines 106–138 and summarized in Supplementary File 1.

      line 262ff: the paragraph starts with explaining ferritin genes that are important for iron control but the reader does not yet know why. Then it is explained that a ferritin gene is DE in the H<SUB>2</SUB>S transcriptomes. then a motivation to look into the labile iron pool is described. Why not first explain what genes are strongly regulated and why they are selected based on their DE? Then explain what is known about these genes and pathways, and then motivate a set of experiments. 

      We agree with the reviewer that our initial description could have been more logically organized. We reframed this section to first present the RNA-seq data, followed by an explanation of their known biological functions and the motivation for the subsequent experiments (Lines 350–357).

      nhr-49 appears suddenly in the results part and it is not clear why it was tested and how the result links. Is nhr-49 a key transcription factor that is activated by H<SUB>2</SUB>S sensory or physiological response, and does it control the signaling or protective changes induced by H<SUB>2</SUB>S?  

      We thank the reviewer for the comment. As suggested, we revised the text to present the information more clearly. In our candidate gene screen, a set of mutants exhibiting reduced speed responses to H<SUB>2</SUB>S had previously been shown to be defective in response to CO<SUB>2</SUB> stimulation (Hallem & Sternberg, 2008). These included animals deficient in nutrient-sensitive pathways, including insulin, TGF-β, and NHR-49, which were reported by Sternberg’s lab to exhibit dampened responses to CO<SUB>2</SUB> (Hallem & Sternberg, 2008) (Lines 173–179). We also included a simple cartoon to further illustrate this (Figure 3C).

      The nuclear hormone receptor NHR-49 has been implicated in a variety of stress responses, including starvation (Van Gilst, Hadjivassiliou, & Yamamoto, 2005), bacterial pathogen (Naim et al., 2021; Wani et al., 2021), and hypoxia (Doering et al., 2022). The nhr-49 mutants exhibited a rapid decline in locomotory speed during H<SUB>2</SUB>S exposure, implicating a role in sustaining high speed in the presence of H<SUB>2</SUB>S. Furthermore, we observed that fmo-2, a well-characterized target gene of NHR-49, was significantly upregulated after 1 hour of exposure to 50 and 150 ppm H<SUB>2</SUB>S (Supplementary file 2), suggesting that NHR-49 signaling is rapidly activated by H<SUB>2</SUB>S exposure. Exactly how NHR-49 contributes to H<SUB>2</SUB>S response requires further investigation.

      (3) Visualization 

      Adding a model/cartoon summary that describes the pathways tested and their interaction would be helpful in some of the figures for the reader to keep an overview of the pathways that were tested. Also, a final summary cartoon that integrates all the puzzle pieces into one larger picture would be helpful. Such a final cartoon overview could also point to the key open questions of the underlying mechanisms. 

      We thank the reviewer for this suggestion. We have added a series of models/cartoons to illustrate the different pathways and their interactions. These include starvation regulatory mechanisms (Figure 3C), 21% O<SUB>2</SUB> sensing mechanisms (Figure 3G), HIF-1 signaling and detoxification (Figure 5—figure supplement 1E), HIF-1 signaling and the regulation of labile iron (Figure 6H), as well as ROS signaling and regulation (Figure 7L). To help interpretation and to elaborate on these models, we have also included explanatory sentences in the corresponding figure legends.

      Other comments: 

      Introduction and line 93: The authors mention that 50 ppm H<SUB>2</SUB>S has beneficial effects on lifespan yet does not have a detectable phenotype. Are there any concentrations of H<SUB>2</SUB>S that cause attraction of C. elegans and what is the preferred range if it exists? Could this be measured in an H<SUB>2</SUB>S gradient?

      We thank the reviewer for the insightful comment. We performed an H<SUB>2</SUB>S gradient assay, which suggests that wild type animals are attracted toward low concentrations of H<SUB>2</SUB>S around 40 ppm (Figure 1G and H) (Lines 95–104). These results suggest that H<SUB>2</SUB>S acts as a strong repellent for C. elegans at high concentrations but as an attractant at low levels. This dual role may be ecologically relevant, as wild C. elegans lives in complex and dynamic environments where H<SUB>2</SUB>S levels likely fluctuate over short distances (Adams, Farwell, Pack, & Bamesberger, 1979; Budde & Roth, 2011; Morra & Dick, 1991; Patange, Breen, Arsuffi, & Ruvkun, 2025; Rodriguez-Kabana, Jordan, & Hollis, 1965; Romanelli-Cedrez, Vairoletti, & Salinas, 2024).

      Line 146: "Local H<SUB>2</SUB>S concentrations could also be significantly higher in decomposing substances where wild C. elegans thrives" please provide a citation.

      As suggested, we included a set of references that describe H<SUB>2</SUB>S enrichment in the natural environment in early field studies (Adams et al., 1979; Morra & Dick, 1991; Rodriguez-Kabana et al., 1965), as well as references that have discussed or implied this in C. elegans studies (Budde & Roth, 2011; Patange et al., 2025; Romanelli-Cedrez et al., 2024). They can be found in the Introduction (Lines 59–62) and in the Results (Lines 197–199).

      Line 311 "Wild C. elegans isolates thrive in the decomposing matters, where the local concentrations of O2 are low while the levels of CO2 and H<SUB>2</SUB>S could be high. These animals have adapted their behavior in such an environment, displaying increased sensitivity to high O2 exposure but dampened responses to CO2." Please provide citations for these statements.  

      As suggested, we cited the relevant articles or books describing the variation of O<SUB>2</SUB> and CO<SUB>2</SUB> levels in the decomposing matters including several C. elegans papers that mentioned this in Lines 197–199 (Bretscher, Busch, & de Bono, 2008; Gea, Barrena, Artola, & Sanchez, 2004; Hallem & Sternberg, 2008; Oshins, Michel, Louis, Richard, & Rynk, 2022), and the above-mentioned articles for H<SUB>2</SUB>S (Adams et al., 1979; Budde & Roth, 2011; Morra & Dick, 1991; Patange et al., 2025; Rodriguez-Kabana et al., 1965; Romanelli-Cedrez et al., 2024).

      For C. elegans’ sensitivity to O2 and CO2, these articles were cited in Lines 201–203 (Beets et al., 2020; Bretscher et al., 2008; Carrillo, Guillermin, Rengarajan, Okubo, & Hallem, 2013; Hallem & Sternberg, 2008; Kodama-Namba et al., 2013; McGrath et al., 2009).

      Reviewer #3 (Recommendations For The Authors): 

      More work could be conducted establishing the neuronal circuitry involved in the initial, tractable response to H<SUB>2</SUB>S. 

      We thank the reviewer for the insightful comment. Since our initial analyses suggest a role of ASJ neurons in H<SUB>2</SUB>S-evoked locomotory responses (Figure 2F and G), we thought that this would offer us an entry point to dissect the neuronal circuit involved in H<SUB>2</SUB>S responses. Expression of the tetanus toxin catalytic domain in ASJ, which blocks neurosecretion, inhibited H<SUB>2</SUB>S-evoked locomotory responses (Figure 2H), suggesting that neurosecretion from ASJ promotes the speed response to H<SUB>2</SUB>S (Lines 162–165). We then performed calcium imaging of ASJ neurons in response to H<SUB>2</SUB>S exposure. However, while we observed CO<SUB>2</SUB>-evoked calcium transients in ASJ using GCaMP6s, we did not detect any calcium response to H<SUB>2</SUB>S under several conditions, including animals on food, off food, and with different H<SUB>2</SUB>S concentrations and exposure times (Figure 2—figure supplement 2E and F) (Lines 166–168). Since signaling from ASJ neurons regulates developmental programs that modify sensory functions in C. elegans, including CO<SUB>2</SUB> and O<SUB>2</SUB> responses (Murakami, Koga, & Ohshima, 2001), the involvement of ASJ neurons is not specific to H<SUB>2</SUB>S responses and ASJ neurons are unlikely to serve as a primary H<SUB>2</SUB>S sensor (discussed in Lines 449–458). Therefore, the exact sensory neuron, circuit and molecular triggers mediating acute H<SUB>2</SUB>S avoidance behavior remain to be elucidated.

      Our subsequent investigation of mitochondrial components suggests that a burst of mitochondrial ROS production may be the trigger for H<SUB>2</SUB>S avoidance, as transient exposure to rotenone substantially increases baseline locomotory activity (Figure 7E) (Lines 391–396). However, mitochondrial ROS could potentially target multiple neurons and cellular machineries to initiate avoidance behavior to H<SUB>2</SUB>S, making it challenging to pinpoint specific sites of action. Nevertheless, we agree that further dissection of the neural circuits and mitochondrial signaling in H<SUB>2</SUB>S avoidance will be important and should be explored in future studies. We discussed this in Lines 509–536.

      I am not sure how to overcome the challenges involved in reaching conclusions from the decreased locomotory responses of animals that are sensitized to the effects of H<SUB>2</SUB>S. Perhaps this conundrum could be discussed in more detail in the text. 

      We thank the reviewer for this important comment. We agree that decreased locomotory speed during H<SUB>2</SUB>S exposure can arise from distinct causes, either systemic toxicity or adaptation, and distinguishing between these is critical. We have included new experiments and revised the text to clarify this issue.

      Our data suggest that increased initial omega turns and a rapid loss of locomotion in hif-1 and detoxification-defective mutants including sqrd-1 and ethe-1 likely reflect an enhanced sensitivity to H<SUB>2</SUB>S toxicity due to their failure to induce appropriate adaptive responses (Figure 5D–F, Figure 5J–L, Figure 5—Figure supplement 1F–P). Supporting this, hif-1 mutants become less responsive to unrelated stimuli (near-UV light) after 30 minutes of H<SUB>2</SUB>S exposure (Figure 5I).

      In contrast, egl-9 and SOD-deficient animals show reduced initial reorientation and reduced speed responses (Figure 5B, Figure 7G, Figure 5—Figure supplement 1A and B, and Figure 7—Figure supplement 1F and G), although both egl-9 and sod mutants respond normally to other stimuli prior to or after H<SUB>2</SUB>S exposure (Figure 5I, Figure 5—Figure supplement 1C, and Figure 7—Figure supplement 1H). Since disrupting egl-9 stabilizes HIF-1 and upregulates the expression of numerous genes involved in cellular defense against H<SUB>2</SUB>S toxicity, the enhanced detoxification capacity in egl-9 mutants likely increases animals’ tolerance to H<SUB>2</SUB>S, thereby reducing avoidance of otherwise toxic H<SUB>2</SUB>S levels. Similarly, persistently high ROS in SOD-deficient animals activates a variety of stress-responsive signaling pathways, including HIF-1, NRF2/SKN-1 and DAF-16/FOXO signaling (Lennicke & Cocheme, 2021; Patten et al., 2010), facilitating cellular adaptation to redox stress and reducing animals’ responsiveness to toxic H<SUB>2</SUB>S levels. Taken together, these findings support the view that reduced locomotory speed during H<SUB>2</SUB>S exposure can arise from distinct mechanisms: early systemic toxicity in hif-1 and detoxification-defective mutants, versus enhanced cellular adaptation in egl-9 and SOD mutants. We have integrated the relevant information across the Results section and discussed this in Lines 485–536.

      References

      Adams, D. F., Farwell, S. O., Pack, M. R., & Bamesberger, W. L. (1979). Preliminary Measurements of Biogenic Sulfur-Containing Gas Emissions from Soils. Journal of the Air Pollution Control Association, 29(4), 380-383. doi:10.1080/00022470.1979.10470805

      Beets, I., Zhang, G., Fenk, L. A., Chen, C., Nelson, G. M., Felix, M. A., & de Bono, M. (2020). Natural Variation in a Dendritic Scaffold Protein Remodels Experience-Dependent Plasticity by Altering Neuropeptide Expression. Neuron, 105(1), 106-121 e110. doi:10.1016/j.neuron.2019.10.001

      Bretscher, A. J., Busch, K. E., & de Bono, M. (2008). A carbon dioxide avoidance behavior is integrated with responses to ambient oxygen and food in Caenorhabditis elegans. Proc Natl Acad Sci U S A, 105(23), 8044-8049. doi:10.1073/pnas.0707607105

      Budde, M. W., & Roth, M. B. (2011). The response of Caenorhabditis elegans to hydrogen sulfide and hydrogen cyanide. Genetics, 189(2), 521-532. doi:10.1534/genetics.111.129841

      Carrillo, M. A., Guillermin, M. L., Rengarajan, S., Okubo, R. P., & Hallem, E. A. (2013). O2-Sensing Neurons Control CO2 Response in C. elegans. Journal of Neuroscience, 33(23), 9675-9683. doi:10.1523/Jneurosci.4541-12.2013

      Doering, K. R. S., Cheng, X., Milburn, L., Ratnappan, R., Ghazi, A., Miller, D. L., & Taubert, S. (2022). Nuclear hormone receptor NHR-49 acts in parallel with HIF-1 to promote hypoxia adaptation in Caenorhabditis elegans. Elife, 11. doi:10.7554/eLife.67911

      Gea, T., Barrena, R., Artola, A., & Sanchez, A. (2004). Monitoring the biological activity of the composting process: Oxygen uptake rate (OUR), respirometric index (RI), and respiratory quotient (RQ). Biotechnol Bioeng, 88(4), 520-527. doi:10.1002/bit.20281

      Hallem, E. A., & Sternberg, P. W. (2008). Acute carbon dioxide avoidance in Caenorhabditis elegans. Proc Natl Acad Sci U S A, 105(23), 8038-8043. doi:10.1073/pnas.0707469105

      Ishii, T., Miyazawa, M., Onouchi, H., Yasuda, K., Hartman, P. S., & Ishii, N. (2013). Model animals for the study of oxidative stress from complex II. Biochim Biophys Acta, 1827(5), 588-597. doi:10.1016/j.bbabio.2012.10.016

      Jia, J., Wang, Z., Zhang, M., Huang, C., Song, Y., Xu, F., . . . Cheng, J. (2020). SQR mediates therapeutic effects of H(2)S by targeting mitochondrial electron transport to induce mitochondrial uncoupling. Sci Adv, 6(35), eaaz5752. doi:10.1126/sciadv.aaz5752  

      Kodama-Namba, E., Fenk, L. A., Bretscher, A. J., Gross, E., Busch, K. E., & de Bono, M. (2013). Crossmodulation of homeostatic responses to temperature, oxygen and carbon dioxide in C. elegans. PLoS Genet, 9(12), e1004011. doi:10.1371/journal.pgen.1004011

      Lennicke, C., & Cocheme, H. M. (2021). Redox metabolism: ROS as specific molecular regulators of cell signaling and function. Mol Cell, 81(18), 3691-3707. doi:10.1016/j.molcel.2021.08.018

      McGrath, P. T., Rockman, M. V., Zimmer, M., Jang, H., Macosko, E. Z., Kruglyak, L., & Bargmann, C. I. (2009). Quantitative mapping of a digenic behavioral trait implicates globin variation in C. elegans sensory behaviors. Neuron, 61(5), 692-699. doi:10.1016/j.neuron.2009.02.012

      Morra, M. J., & Dick, W. A. (1991). Mechanisms of H2S production from cysteine and cystine by microorganisms isolated from soil by selective enrichment. Appl Environ Microbiol, 57(5), 1413-1417. doi:10.1128/aem.57.5.1413-1417.1991

      Murakami, M., Koga, M., & Ohshima, Y. (2001). DAF-7/TGF-beta expression required for the normal larval development in C. elegans is controlled by a presumed guanylyl cyclase DAF-11. Mechanisms of Development, 109(1), 27-35. doi:10.1016/S0925-4773(01)00507-X

      Naim, N., Amrit, F. R. G., Ratnappan, R., DelBuono, N., Loose, J. A., & Ghazi, A. (2021). Cell nonautonomous roles of NHR-49 in promoting longevity and innate immunity. Aging Cell, 20(7), e13413. doi:10.1111/acel.13413

      Ochi, R., Dhagia, V., Lakhkar, A., Patel, D., Wolin, M. S., & Gupte, S. A. (2016). Rotenone-stimulated superoxide release from mitochondrial complex I acutely augments L-type Ca2+ current in A7r5 aortic smooth muscle cells. Am J Physiol Heart Circ Physiol, 310(9), H1118-1128. doi:10.1152/ajpheart.00889.2015  

      Onukwufor, J. O., Farooqi, M. A., Vodickova, A., Koren, S. A., Baldzizhar, A., Berry, B. J., . . . Wojtovich, A. P. (2022). A reversible mitochondrial complex I thiol switch mediates hypoxic avoidance behavior in C. elegans. Nat Commun, 13(1), 2403. doi:10.1038/s41467-022-30169-y

      Oshins, C., Michel, F., Louis, P., Richard, T. L., & Rynk, R. (2022). Chapter 3 - The composting process. In R. Rynk (Ed.), The Composting Handbook (pp. 51-101): Academic Press.  

      Patange, O., Breen, P., Arsuffi, G., & Ruvkun, G. (2025). Hydrogen sulfide mediates the interaction between C. elegans and Actinobacteria from its natural microbial environment. Cell Reports, 44(1), 115170. doi:10.1016/j.celrep.2024.115170

      Patten, D. A., Germain, M., Kelly, M. A., & Slack, R. S. (2010). Reactive oxygen species: stuck in the middle of neurodegeneration. J Alzheimers Dis, 20 Suppl 2, S357-367. doi:10.3233/JAD-2010-100498

      Ramsay, R. R., & Singer, T. P. (1992). Relation of superoxide generation and lipid peroxidation to the inhibition of NADH-Q oxidoreductase by rotenone, piericidin A, and MPP+. Biochem Biophys Res Commun, 189(1), 47-52. doi:10.1016/0006-291x(92)91523-s

      Rodriguez-Kabana, R., Jordan, J. W., & Hollis, J. P. (1965). Nematodes: Biological Control in Rice Fields: Role of Hydrogen Sulfide. Science, 148(3669), 524-526. doi:10.1126/science.148.3669.524

      Romanelli-Cedrez, L., Vairoletti, F., & Salinas, G. (2024). Rhodoquinone-dependent electron transport chain is essential for Caenorhabditis elegans survival in hydrogen sulfide environments. J Biol Chem, 300(9), 107708. doi:10.1016/j.jbc.2024.107708

      Van Gilst, M. R., Hadjivassiliou, H., & Yamamoto, K. R. (2005). A Caenorhabditis elegans nutrient response system partially dependent on nuclear receptor NHR-49. Proc Natl Acad Sci U S A, 102(38), 13496-13501. doi:10.1073/pnas.0506234102

      Wani, K. A., Goswamy, D., Taubert, S., Ratnappan, R., Ghazi, A., & Irazoqui, J. E. (2021). NHR-49/PPAR-α and HLH-30/TFEB cooperate for host defense via a flavin-containing monooxygenase. Elife, 10, e62775. doi:10.7554/eLife.62775

      Yang, W., & Hekimi, S. (2010). A mitochondrial superoxide signal triggers increased longevity in Caenorhabditis elegans. PLoS Biol, 8(12), e1000556. doi:10.1371/journal.pbio.1000556

      Zorov, D. B., Juhaszova, M., & Sollott, S. J. (2014). Mitochondrial reactive oxygen species (ROS) and ROS-induced ROS release. Physiol Rev, 94(3), 909-950. doi:10.1152/physrev.00026.2013

    1. Reviewer #1 (Public review):

      The revised manuscript addresses several reviewer concerns, and the study continues to provide useful insights into how ZIP10 regulates zinc homeostasis and zinc sparks during fertilization in mice. The authors have improved the clarity of the figures, shifted emphasis in the abstract more clearly to ZIP10, and added brief discussion of ZIP6/ZIP10 interactions and ZIP10's role in zinc spark-calcium oscillation decoupling. However, some critical issues remain only partially addressed.

      (1) Oocyte health confound: The use of Gdf9-Cre deletes ZIP10 during oocyte growth, meaning observed defects could result from earlier disruptions in zinc signaling rather than solely from the absence of zinc sparks at fertilization. The authors acknowledge this and propose transcriptome profiling as a future direction. However, since mRNA levels often do not accurately reflect protein levels and activity in oocytes, transcriptomics may not be particularly informative in this context. Proteomic approaches that directly assess the molecular effects of ZIP10 loss seem more promising. Although current sensitivity limitations make proteomics from small oocyte samples challenging, ongoing improvements in this area may soon allow for more detailed mechanistic insights.

      (2) ZIP6 context and focus: The authors clarified the abstract to emphasize ZIP10, enhancing narrative clarity. This revision is appropriate and appreciated.

      (3) Follicular development effects: The biological consequences of ZIP6 and ZIP10 knockout during folliculogenesis are still unknown. The authors now say these effects will be studied in the future, but this still leaves a major mechanistic gap unaddressed in the current version.

      (4) Zinc spark imaging and probe limitations: The addition of calcium imaging enhances the clarity of Figure 3. However, zinc fluorescence remains inadequate, and the authors depend solely on FluoZin-3AM, a dye known for artifacts and limited ability to detect subcellular labile zinc. The suggestion that C57BL/6J mice may differ from CD1 in vesicle appearance is plausible but does not fully address concerns about probe specificity and resolution. As the authors acknowledge, future studies with more selective probes would increase confidence in both the spatial and quantitative analysis of zinc dynamics.

      (5) Mechanistic insight remains limited: The revised discussion now recognizes the lack of detailed mechanistic understanding but does not significantly expand on potential signaling pathways or downstream targets of ZIP10. The descriptive data are useful, but the inability to pinpoint how ZIP10 mediates zinc spark regulation remains a key limitation. Again, proteomic profiling would probably be more informative than transcriptomic analysis for identifying ZIP10-dependent pathways once technical barriers to low-input proteomics are overcome.

      Overall, the authors have reasonably revised and clarified key points raised by reviewers, and the manuscript now reads more clearly. However, the main limitation, lack of mechanistic insight and the inability to distinguish between developmental and fertilization-stage roles of ZIP10, remains unresolved. These should be explicitly acknowledged when framing the conclusions.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The authors investigated the role of the zinc transporter ZIP10 in regulating zinc sparks during fertilization in mice. By utilizing oocyte-specific Zip6 and Zip10 conditional knockout mice, the authors effectively demonstrate the importance of ZIP10 in zinc homeostasis, zinc spark generation, and early embryonic development. The study is overall useful as it identifies ZIP10 as an important component of oocyte processes that support embryo development, thus opening the door for further investigations. While the study provides solid evidence for the requirement of ZIP10 in the regulation of zinc sparks and zinc homeostasis, it falls short of revealing the underlying mechanism of how ZIP10 exerts this important function.

      This report is the first to clarify the previously unknown role of the zinc transporter ZIP10 expressed in oocytes, and it does not focus on the detailed mechanism. As you pointed out, we believe that the mechanism will also be important information in the field of fertilization and embryogenesis research, and that it will be necessary to address this issue in the future.

      (1) The zinc transporters the authors are knocking out are expressed in mouse oocytes through follicular development, and the Gdf9-cre driver used means these oocytes were grown in the absence of appropriate Zinc signaling. Thus, it would be difficult to assert that the lack of fertilization associated with zinc sparks is solely responsible for the failure of embryo development. Spindle morphology and other meiotic parameters do not necessarily report oocyte health, so normalcy of these features may not be a strong argument when it comes to metabolic issues.

      As you rightly observe, the results of this study do not entirely exclude the possibility that oocyte health is compromised in the absence of adequate zinc homeostasis during oocyte growth. However, evidence has been presented demonstrating that spindle formation does occur in Zip10<sup>d/d</sup> mouse oocytes (Fig. 2C), that fertilization occurs despite the absence of zinc spark (Fig. 3 and Fig. 4A), and that some embryos develop to blastocysts (Fig. 4B). We believe that future studies should evaluate the transcriptome profile of Zip10<sup>d/d</sup> mouse oocytes.

      (2) While comparing ZIP6 and ZIP10 in the abstract provides context, focusing more on ZIP10 would improve reader comprehension, as ZIP10 is the primary focus of the study. Emphasizing the specific role of ZIP10 will help the reader grasp the core findings more clearly.

      Thank you for your valuable input. We have revised the summary to focus more on ZIP10 by removing the section in the summary that mentions ZIP6 (P.1-2 Line 34-52).

      (3) Zinc transporters ZIP6 and ZIP10 are expressed during follicular development, but the biological significance of the observation is not clearly addressed. The authors should investigate whether the ZIP6 and ZIP10 knockout affects follicular development and discuss the potential implications.

      Thank you for your valuable input. As you mentioned, we have not been able to clarify the effects of ZIP6 and ZIP10 knockout on follicle formation. However, this report clarifies the role of ZIP-mediated zinc ions in their inclusion. The effect of ZIP knockout on follicle formation will be discussed in the future.

      (4) In Figure 3, the zinc fluorescence images are unclear, making it difficult for readers to interpret the data. Including snapshot images of calcium and zinc spikes as part of the main figure would improve clarity. Moreover, adding more comparative statements and a deeper explanation of why Zip10 KO mice exhibit normal calcium oscillations but lack zinc sparks would strengthen the manuscript.

      Thank you for your suggestion. We have added images of calcium elevation after fertilization to Fig. 3 and Fig. S3. In addition, the figure legends have been changed (P.29 Line 937-939, P.34 Line 1104-1106). As to why Zip10 KO mice show normal calcium oscillations but lack zinc sparks, as mentioned in the Discussion (P. 10 Line 299-300), we speculate that the zinc ions present in Zip10<sup>d/d</sup> mouse oocytes induce Ca<sup>2+</sup> release without compromising IP3R1 sensitivity. We also assume that the lack of zinc spark is due to low accumulation of zinc ions in the oocytes via ZIP10, as described in the Discussion (P.10 Line 300-302).

      (5) While the study identifies the role of ZIP10 in zinc spark generation, it lacks a clear mechanistic insight. The topic itself is interesting, but without providing a more detailed explanation of the underlying mechanisms, the study leaves an important gap. Further discussion on the signaling pathways potentially involved in zinc spark regulation would add depth to the findings.

      Thank you for your input. This report is the first to clarify the role of the zinc transporters ZIP6 and ZIP10 expressed in oocytes, which was previously unknown, and does not focus on the detailed mechanism. As you pointed out, we believe that the mechanism and signaling pathways will also be important information, and we believe that it is necessary to research this issue in the future.

      Reviewer #2 (Public review):

      Summary:

      In this important study, the authors examine the role of two zinc uptake transporters, Zip6 and Zip10, which are important during the maturation of oocytes, and are critical for both successful fertilization and early embryogenesis.

      Strengths:

      The authors report that oocytes from Zip10 knockout mice exhibit lower labile zinc content during oocyte maturation, decreased amounts of zinc exocytosis during fertilization, and affect the rate of blastocyst generation in fertilized eggs relative to a control strain. They do not observe these changes in their Zip6 knockout animals. The authors present clear and well-documented results from a broad range of experimental modalities in support of their conclusions.

      Thank you for your positive comments.

      Weaknesses:

      (1) The authors' statement that Zip10 is not expressed in the oocyte nuclei (line 252: "Furthermore, in that study, ZIP10 was detected in the nuclear/nucleolar positions of oocytes of all follicular stages (Chen et al., 2023), which we did not observe.") is not supported by Figure 1, where some Zip10 signal is apparent in the primordial, primary, and secondary follicle oocytes. This statement should be corrected.

      Thank you for pointing this out. Our results of ISH staining (Fig. 1A) and immunofluorescence staining (Fig. 1B) showed that ZIP10 was not detected at the nucleus/nucleolus location. In other words, it could not be detected there at either the mRNA or the protein level. Based on the results of ISH staining and immunofluorescence staining, we conclude that it is expressed in the plasma membrane.

      (2) Based on the FluoZin-3AM data, there appears to be less labile zinc in the Zip10d/d oocyte, eggs, and embryos; however, FluoZin-3AM has a number of well-known artifacts and does not accurately capture the localization of labile zinc pools. The patterns do not correspond to the well-documented zinc-containing cortical vesicles. Another zinc probe, such as ZinPyr-4 or ZincBY-1 should be used to visualize the zinc vesicles and confirm that there is less labile zinc in these locations as well.

      Thank you for your suggestion. Previous studies (Lisle et al., 2013, Reproduction) and our report (Kageyama et al., 2022, Animal Science Journal) have shown that it is possible to examine the presence of labile zinc ions in oocytes and embryos. In addition, the mouse oocytes (embryos) reported in previous studies are from CD1 (ICR) mice, whereas our study was conducted using C57BL/6J mice. In our report (Kageyama et al., 2024, Journal of Reproduction and Development), we showed that the appearance of zinc vesicles in oocytes observed by FluoZin-3AM staining differs between CD1 and C57BL/6J mice, and we believe that this appearance of cortical vesicles in C57BL/6J mice is not a problem. As you say, we have not used other zinc probes and will consider this in the future.

      (3) Line 268: "The results indicate that ZIP10 is mostly responsible for the uptake of zinc ions in mouse oocytes." The situation seems a bit more complicated given that the differences in labile zinc content between oocytes from the WT and Zip10d/d animals are small (only 20-30 %) and that the zinc spark is diminished but still apparent at a low level in the Zip10d/d oocytes. Clearly, other factors are involved in zinc uptake at these stages. A variety of studies have suggested that Zip6 and Zip10 work together, perhaps even functioning as a heterodimer in some systems. The double KO would address this more clearly, but if it is not available, it might be more prudent to state that Zip10 plays some role in uptake of zinc in mouse oocytes while the role of Zip6 remains uncertain.

      We would like to express our gratitude for the comments received. The phenotype of double knockout mice for ZIP6 and ZIP10 will be discussed at a future date. We have also added to the text that the role of ZIP6 remains uncertain (P. 11 Line 353-354).

      (4) Zip6d/d oocytes did not have changes in labile zinc, nor did the lack of Zip6 have an impact on the zinc spark. However, Figure S1 does show a small amount of detectable Zip6 in the western blot. It is possible that this small amount could compensate for the complete lack of Zip6. Can ZIP6 be found in immunofluorescence of GV oocytes or MII eggs from the Zip6d/d animals? Additionally, it is possible that Zip6's role is only supplementary to that of Zip10. The authors should discuss this possibility. It would also be interesting to see if the Zip6/Zip10 double knockout displays greater defects compared to the Zip10 knockout when considering previous studies.

      Thank you for your input. The mice are deficient in the gene so that ZIP6 is not functional. In our view, the results of WB analysis are not indicative of protein structural functionality, even in cases where the ZIP6 antibody detects a small amount of protein. Since the role of ZIP6 was not elucidated in this study, we added a statement to that effect in the text (P. 11 Line 353-354). In addition, studies using Zip6/Zip10 double knockout mice will be discussed in the future.

      Recommendations for the authors: please note that you control which revisions to undertake from the public reviews and recommendations for the authors.

      We have revised the text based on the reviewerʼs suggestions.

      Reviewer #1 (Recommendations for the authors):

      (1) In lines 133-136, it seems that the authors would like to aim to emphasize the lack of research on oocytes compared to other tissues and cells. However, the inclusion of unrelated contexts, such as the role of ZIP10 in cancer and skin, appears unnecessary and detracts from the focus on oocyte-specific mechanisms. Removing these unrelated sentences would help maintain clarity and relevance in the introduction.

      As you indicated, we removed the sentence that is not related to oocytes (P.4 Line 120-125). “Further, they reported that targeted disruption using Zip6- and Zip10-specific morpholino injection or antibody incubation induced alteration of the intracellular labile zinc content, spontaneous resumption of meiosis from the PI arrest and premature arrest at a telophase I-like state (Kong et al., 2014). It is clear from these reports that ZIP6 and ZIP10 are involved in zinc transport in oocytes, but the function is not elucidated.”

      (2) Ensure that all video files are properly labeled to enhance understanding.

      Improved video labels for clarity (Movie 1-8, Movie S1-S4)

      (3) Correct mislabeling issues, such as the one in line 209.

      Corrected as follows: Zip10<sup>d/d</sup> mouse oocytes can be fertilized but were unlikely to develop to blastocysts (P. 6-7 Line 196-197).

      (4) In Figure 4D, the amount of ZIP2 appears to increase relative to actin. Including quantification would make the data more robust. Similarly, in Figure 4F, JUNO levels appear increased in Zip10 KO. Please provide quantification.

      The WB band images in Fig. 4D were quantified and their graphs were added to the lower part of Fig. 4D. Furthermore, the JUNO immunofluorescence images in Figure 4F were quantified and their graphs were added to Fig. S4. Figure legends and text were corrected and added. P. 30 Line 975-979: Expression level of β-actin serves as a protein loading control and quantified the expression level of ZP2. Molecular mass is indicated at the left. Statistical differences were calculated according to the one-way ANOVA. Different letters represent significant differences (p < 0.05).

      P. 35 Line: Fig. S4 Comparison of JUNO expression in Zip10<sup>f/f</sup> and Zip10<sup>d/d</sup> mouse MII oocytes. To measure JUNO-immunofluorescence intensity, oocytes images were selected as regions of interest (ROIs) and measured using ImageJ. Statistical differences were calculated according to Student’s t-test (p > 0.05; no significant difference).

      P.7 Line 206-209: As for the expression of JUNO, it had the same expression than between null and control oocytes (Fig. S4) and the temporal dynamics of its disappearance from the cortex after fertilization was similar for both Zip10<sup>f/f</sup> and Zip10<sup>d/d</sup> groups (Fig. 4F).

      (5) Some of the sentences lack proper references.

      The entire text was reviewed and references inserted where necessary.

      P.7 Line 221, P.7 Line 222-223, P.8 Line 253-254, P.12 Line 358-360 and P.24 Line 698-699.

      Reviewer #2 (Recommendations for the authors):

      Revisions are warranted in order to address the issues noted in the Weaknesses section of the Public Review. 

      Thank you for your comments, we have individually addressed the areas you pointed out in the Weaknesses section. The following text has also been corrected and edited.

      (1) Line 247 "In primordial follicles, the ooplasmic staining of ZIP10 we anticipate corresponds to ooplasmic vesicular sites."

      The text of P. 8 Line 230-232 was revised as follows.

      "In primordial follicles, the ooplasm staining of ZIP10 we anticipate corresponds to ooplasmic vesicular sites."

      (2) Line 926 "ZP2 was not stained in primordial follicle, but primary, secondary, and antral follicles stained. FOXL2 was observed in granulosa cells of all stage follicles. The scale bar represents 20 μm of primordial-secondary follicle and 150 μm of antral follicle." All three sentences have grammar issues that should be fixed.

      The text of P. 28 Line 908-911 was revised as follows.

      It was observed that ZP2 was not present in the primordial follicle; however, it was present in the primary, secondary and antral follicles. Furthermore, FOXL2 was observed at granulosa cells of all stage follicles. Scale bar: 20 µm (primordial, primary and secondary follicle); 150 µm (antral follicle).

    1. Reviewer #1 (Public review):

      Summary:

      The authors performed an in-depth analysis of three mouse strains with different levels of susceptibility to metabolic disease. Transcriptomics analyses of relevant deep tissues revealed many strain-specific differences in response to diet. They used gene set enrichment analysis to highlight possible biological pathways that may be involved in obesity and its metabolic consequences. These results were then confirmed using public data in both mice and humans.

      Strengths:

      Overall, this is an interesting study into the biological basis of differing phenotypic outcomes in response to metabolic challenges. The findings uncover several pathways that may shed light on the etiology of obesity and the associated health risks, as well as offer potential therapeutic avenues to prevent them.

      Weaknesses:

      While the experimental design and analysis are mostly good, some aspects of the present paper could be improved.

      (1) Most results are insufficiently described. P-values are almost entirely absent in the main text. Sometimes the significance is indicated in the figures, and other times it is missing. For example, strains are sometimes described as having a higher XYZ, something that is never shown in the plots, and no p-value is ever given.

      (2) While the biological methods are meticulously described, statistical methods are barely mentioned in the methods section. For example, line 578, "multiple comparisons (...) were performed using the glht function of the multcomp package". What is this? What method does it use? And how was mediation analysis done? Line 575 mentions that models were compared, with no description of how this was done. Mentioning the package (or even function) is not sufficient. The package and function are an implementation; they are not the method. The actual method needs to be clearly mentioned and (at least minimally) described, in addition to having the reference for methods that are not ubiquitous (i.e., the Benjamini-Hochberg method is well-enough established to forgo this).

      (3) The methods should also be briefly introduced in the results section before describing the results of those methods.

      (4) The role of immune signaling pathways and associated phenotypes (e.g., monocyte fraction) is over-interpreted. While the differences shown are convincing, they do not convincingly show a role in either obesity or disease. The parsimonious explanation is that such changes happen as a consequence of dyslipidemia rather than a cause. It is possible that these pathways play a more direct role in this, but the authors do not present compelling evidence of this, and, failing this, the language in the text needs to be toned down.
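
      To illustrate the distinction drawn in point (2) between a method and its implementation, the Benjamini-Hochberg step-up procedure named there can be written out in a few lines. This is a minimal sketch under stated assumptions, not the analysis used in the manuscript: the function name `benjamini_hochberg`, the default `alpha`, and the example p-values are all hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns a list of booleans, in the input order, that is True where
    the corresponding hypothesis is rejected at false discovery rate alpha.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
example = [0.039, 0.001, 0.205, 0.008, 0.041, 0.06, 0.042, 0.074]
decisions = benjamini_hochberg(example, alpha=0.05)
```

      Stating the procedure at this level (rank the p-values, find the largest rank whose p-value falls under the scaled threshold, reject everything up to that rank) is the kind of description the point above asks for; the package call alone does not convey it.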

    2. Reviewer #2 (Public review):

      This study investigated changes in metabolic health across three genetically diverse mouse strains (NZO/HlLtJ, C57BL/6J, CAST/EiJ) that were fed either control or high-fat high-sucrose diets. The strengths of this study are the depth of metabolic phenotyping, the use of both male and female mice, and the multi-tissue metabolic analysis, including metabolic and gene expression analysis in pancreatic islets, kidney, muscle, heart, liver, and adipose tissue.

      Weaknesses include that only three mouse strains were included in this comparison, particularly given that similar comparisons have been published in the past and that the Jax lab has access to a wide range of mouse strains with diverse genetic backgrounds. Why were CAST mice included over (for example) BALB/c mice that are more commonly used in metabolic studies and are well known to show protection against diet-induced metabolic disease? Furthermore, the feeding regime was limited to 9 weeks, which may not be sufficient to evoke pronounced metabolic remodelling.

      NZO mice are well known to develop obesity. However, only approximately 50% develop type 2 diabetes and beta-cell dysfunction. How were these mice selected in the study? The results state 'Most of the male NZO mice and a few female mice displayed overt diabetes', suggesting that all mice were included irrespective of their diabetic phenotype. More information on the rationale for this is required.

      The transcriptomics data are presented in a convoluted way. As a reader, the main interest would be to determine the differences in diet-induced adaptations within each strain (e.g., why are CAST mice resistant to diet-induced metabolic defects?). However, the way Figure 4 is currently presented does not allow for this. Instead, the data are 'compressed' by looking at general changes in metabolic pathways between tissues in all three mouse strains. In addition, Figure 4E does not show the directionality of the responses within each pathway. For example, are the metabolism and inflammation pathways suppressed or activated? While more data is shown for adipose tissue, this is not sufficient.

      Currently, the metabolic cage data are separated by diet within the main figures. However, given that the diet effect is the major comparison, this needs to be rearranged, and strain differences within each diet could be shown within the supplement.

      The graphs lack labelling throughout to specify which lines/bars represent which strains and diets. This is particularly the case in the metabolic cage analysis.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Yamazaki et al. conducted multiple microscopy-based GFP localization screens, from which they identified proteins that are associated with PM/cell wall damage stress response. Specifically, the authors identified that bud-localized TMD-containing proteins and endocytotic proteins are associated with PM damage stress. The authors further demonstrated that polarized exocytosis and CME are temporally coupled in response to PM damage, and CME is required for polarized exocytosis and the targeting of TMD-containing proteins to the damage site. From these results, the authors proposed a model that CME delivers TMD-containing repair proteins between the bud tip and the damage site.

      Strengths:

      Overall, this is a well-written manuscript, and the experiments are well-conducted. The authors identified many repair proteins and revealed the temporal coordination of different categories of repair proteins. Furthermore, the authors demonstrated that CME is required for targeting of repair proteins to the damage site, as well as cellular survival in response to stress related to PM/cell wall damage. Although the roles of CME and bud-localized proteins in damage repair are not completely new to the field, this work does have conceptual advances by identifying novel repair proteins and proposing the intriguing model that the repairing cargoes are shuttled between the bud tip and the damaged site through coupled exocytosis and endocytosis.

      Weaknesses:

      While the results presented in this manuscript are convincing, they might not be sufficient to support some of the authors' claims. Especially in the last two Results sections, the authors claimed CME delivers TMD-containing repair proteins from the bud tip to the damage site. The model is no doubt highly possible based on the data, but caveats still exist. For example, the repair proteins might not be transported from one localization to another localization, but are degraded and resynthesized. Although the Gal-induced expression system can further support the model to some extent, I think more direct verification (such as FLIP or photo-convertible fluorescence tags to distinguish between pre-existing and newly synthesized proteins) would significantly improve the strength of evidence.

      Major experiment suggestions:

      (1) The authors may want to provide more direct evidence for "protein shuttling" and for excluding the possibility that proteins at the bud are degraded and synthesized de novo near the damage site. For example, if the authors could use FLIP to bleach bud-localized fluorescent proteins, and the damaged site does not show fluorescent proteins upon laser damage, this will strongly support the authors' model. Alternatively, the authors could use photo-convertible tags (e.g., Dendra) to differentiate between pre-existing repair proteins and newly synthesized proteins.

      (2) In line with point 1, the authors used Gal-inducible expression, which supported their model. However, the authors may need to show protein abundance in galactose, glucose, and upon PM damage. Western blot would be ideal to show the level of full-length proteins, or whole-cell fluorescence quantification can also roughly indicate the protein abundance. Otherwise, we cannot assume that the tagged proteins are only expressed when they are growing in galactose-containing media.

      (3) Similarly, for Myo2 and Exo70 localization in CME mutants (Figure 4), it might be worth doing a western or whole-cell fluorescence quantification to exclude the caveat that CME deficiency might affect protein abundance or synthesis.

      (4) From the authors' model in Figure 7, it looks like the repair proteins contribute to bud growth. Does laser damage to the mother cell prevent bud growth due to the reduction of TMD-containing repair proteins at the bud? If the authors could provide evidence for that, it would further support the model.

      (5) Is the PM repair cell-cycle-dependent? For example, would the recruitment of repair proteins to the damage site be impaired when the cells are under alpha-factor arrest?

    2. Reviewer #2 (Public review):

      This paper identifies plasma membrane repair proteins and reveals spatiotemporal cellular responses to plasma membrane damage. The study combines sodium dodecyl sulfate (SDS) and laser damage to identify and characterize proteins involved in plasma membrane (PM) repair in Saccharomyces cerevisiae. Of the 80 PM repair proteins identified, 72 were novel. The use of both proteomic and microscopy approaches revealed the spatiotemporal coordination of exocytosis and clathrin-mediated endocytosis (CME) during repair. Interestingly, the authors were able to demonstrate that exocytosis dominates early and CME later, with CME also playing an essential role in trafficking transmembrane-domain (TMD) containing repair proteins between the bud tip and the damage site.

      Weaknesses/limitations:

      (1) Why are the authors saying that Pkc1 is the best characterized repair protein? What is the evidence?

      (2) It is unclear why the authors continued the laser damage assay exclusively with the C-terminal GFP-tagged library. This choice could have missed N-terminal tag-dependent localizations and functions and may have excluded functionally important repair proteins.

      (3) The use of SDS and laser damage may bias the screen toward proteins responsive to these specific stresses, potentially missing proteins involved in other forms of plasma membrane injury (mechanical, osmotic, etc.). SDS stress is known to indirectly induce oxidative stress and heat-shock responses.

      (4) It is unclear what the scale bars of Figures 3, 5, and 6 are. These should be included in the figure legend.

      (5) Figure 4 should be organized to compare WT vs. mutant, which would emphasize the magnitude of impairment.

      (6) It would be interesting to expand on possible mechanisms for CME-mediated sorting and retargeting of TMD proteins, including a speculative model.

    3. Reviewer #3 (Public review):

      Summary:

      This work aims to understand how cells repair damage to the plasma membrane (PM). This is important, as failure to do so will result in cell lysis and death. Therefore, this is an important fundamental question with broad implications for all eukaryotic cells. Despite this importance, there are relatively few proteins known to contribute to this repair process. This study expands the number of experimentally validated PM repair proteins from 8 to 80. Further, the authors use precise laser-induced damage of the PM/cell wall and live-cell imaging to track the recruitment of repair proteins to these damage sites. They focus on repair proteins that are involved in either exocytosis or clathrin-mediated endocytosis (CME) to understand how these membrane remodeling processes contribute to PM repair. Through these experiments, they find that while exocytosis and CME both occur at the sites of PM damage, exocytosis predominates in the early stages of repair, while CME predominates in the later stages. Lastly, they propose that CME is responsible for diverting repair proteins localized to the growing bud cell to the site of PM damage.

      Strengths:

      The manuscript is very well written, and the experiments presented flow logically. The use of laser-induced damage and live-cell imaging to validate the proteome-wide screen using SDS-induced damage strengthens the role of the identified candidates in PM/cell wall repair.

      Weaknesses:

      (1) Could the authors estimate the fraction of their candidates that are associated with cell wall repair versus plasma membrane repair? Understanding how many of these proteins may be associated with the repair of the cell wall or PM may be useful for thinking about how these results are relevant to systems that do or do not have a cell wall. Perhaps this is already in their GO analysis, but I don't see it mentioned in the manuscript.

      (2) Do the authors identify actin cable-associated proteins or formin regulators associated with sites of PM damage? Prior work from the senior author (reference 26) shows that the formin Bnr1 relocalizes to sites of PM damage, so it would be interesting if Bnr1 and its regulators (e.g., Bud14, Smy1, etc) are recruited to these sites as well. These may play a role in directing PM repair proteins (see more below).

      (3) Do the authors suspect that actin cables play a role in the relocalization of material from the bud tip to PM damage sites? They mention that TMD proteins are secretory vesicle cargo (lines 134-143) and that Myo2 localizes to damage sites. Together, this suggests a possible role for cable-based transport of repair proteins. While this may be the focus of future work, some additional discussion of the role of cables would strengthen their proposed mechanism (steps 3 and 4 in Figure 7).

      (4) Lines 248-249: I find the rationale for using an inducible Gal promoter here unclear. Some clarification is needed.

    1. Reviewer #1 (Public review):

      (1) In this study, the authors aimed to characterize Huntington's disease (HD)-related microstructural abnormalities in the basal ganglia and thalami as revealed using Soma and Neurite Density Imaging (SANDI) indices (apparent soma density, apparent soma size, extracellular water signal fraction, extracellular diffusivity, apparent neurite density, fractional anisotropy and mean diffusivity).

      (2) The study implements a novel biophysical diffusion model that extends up-to-date methodologies and presents a significant potential for quantifying neurodegenerative processes of the grey matter of the human brain in vivo. The authors comment on the usefulness of this technique in other pathologies, but they exemplify it only with multiple sclerosis. Further development of this, building evidence, should be provided.

      (3) The study found that HD-related neurodegeneration in the striatum accounted significantly for striatal atrophy and correlated with motor impairments. HD was associated with reduced soma density, increased apparent soma size, and extracellular signal fraction in the basal ganglia, but not in the thalami. Additionally, these effects were larger at the manifest stage.

      (4) The results of this work demonstrate the impact of HD on the basal ganglia and thalami, which can be further explored as a non-invasive biomarker of disease progression. Additionally, the study shows that SANDI can be used to explore grey matter microstructure in a variety of neurological conditions.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to investigate whether advanced microstructural diffusion MRI modeling using the SANDI framework could reveal clinically relevant tissue alterations in the subcortical structures of individuals with Huntington's disease (HD). Specifically, they sought to determine if SANDI-derived parameters (such as soma density, soma size, and extracellular diffusivity) could detect abnormalities in both manifest and premanifest HD stages, complement standard MRI biomarkers (e.g., volume, MD), and correlate with disease burden and motor impairment. Through this, they hoped to demonstrate the feasibility and added biological specificity of SANDI for early detection and characterization of HD pathology.

      Strengths:

      (1) Novelty and relevance:

      This is, to the best of my knowledge, the first clinical deployment of SANDI in HD, offering more biophysically interpretable and specific imaging biomarkers than standard DTI or volumetric features.

      (2) More specific microstructural insight: Traditional approaches have used volumetric features (e.g., striatal volume loss) or DTI metrics (like FA and MD), which are indirect and non-specific markers. They can indicate something is "wrong" but not what is wrong.

      (3) SANDI parameters permit establishing clearer links with microstructure:

      o Apparent soma density (fis): proxy for neuronal/glial cell body density.

      o Apparent soma size (rs): reflects possible glial hypertrophy or neuronal shrinkage.

      o Neurite density (fin): linked to dendritic/axonal integrity.

      o Extracellular fraction and diffusivity: sensitive to edema, gliosis, and tissue loss.

      In this way, a decrease in soma density can be related to neural loss (e.g., medium spiny neurons), and an increase in soma size and extracellular fraction could be related to glial reactivity (astrocytes, microglia). This enables differentiating between atrophy due to neuron loss vs reactive gliosis, which volumetrics or DTI cannot do.

      (4) Integration of modalities: The inclusion of motor impairment (Q-Motor), HD-ISS staging, and multi-compartment diffusion modeling is a methodological strength.

      (5) Early detection potential: SANDI metrics showed abnormalities in premanifest HD, sometimes even when volume loss was mild or absent. This suggests the potential for earlier, more sensitive biomarkers of disease progression.

      (6) Predictive power: Regression models showed that SANDI metrics explained up to 63% of the variance in striatal volumes in HD. And this correlated strongly with motor impairment and disease burden (CAP100). This shows they are not just redundant with volume or DTI, but they are complementary and potentially more mechanistically meaningful.

      Weaknesses:

      Certain aspects of the study would benefit from clarification:

      (1) Scanner and acquisition consistency: While HD data are from the WAND study, it is not clear whether controls were scanned on the same scanner or protocol. Given the use of model-derived metrics (especially SANDI), differences in scanner or acquisition could introduce confounds. Also, although it offers novel and biologically informative markers, widespread clinical translation still faces hurdles. For instance, the study used a 3T Connectom scanner (300mT/m gradients), which is not widely available. Reproduction of these results in standard 3T clinical scanners would be a great addition, in scenarios with lower resolution, less precise parameter recovery, and longer scans if SNR needs to be maintained.

      (2) HD-ISS staging and group comparisons:

      a) Only 26-27 out of 56 gene-positive participants could be assigned HD-ISS stages, and none were classified into stages 0 or 4.

      b) Visual overlap between stages 1 and 2 in behavioral and imaging features suggests that staging-based group separation may not be robust.

      c) As a result, claims based on progression across HD-ISS stages may be overinterpreted or underpowered.

      (3) Regression modeling choices:

      a) SANDI metrics included in the models differ between HC and HD groups, reducing comparability.

      b) The potential impact of multicollinearity (e.g., between fis and rs) is not discussed.

      c) Beta coefficients could reflect model instability or parameter degeneracy rather than true biological effects.
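
      The multicollinearity concern in point (3b) can be checked with a simple diagnostic. The sketch below is a minimal illustration, not the authors' analysis: it computes the Pearson correlation between two predictors and the corresponding variance inflation factor, which for exactly two predictors reduces to 1 / (1 - r^2). The function names and the demo values are hypothetical and made up for the example; they are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """Variance inflation factor when a model has exactly two predictors.

    With two predictors, the R^2 from regressing one on the other equals
    the squared Pearson correlation, so VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Made-up illustrative values standing in for two SANDI metrics
# (e.g., fis and rs); NOT data from the manuscript.
fis_demo = [1.0, 2.0, 3.0, 4.0]
rs_demo = [2.0, 1.0, 4.0, 3.0]
print(pearson_r(fis_demo, rs_demo), vif_two_predictors(fis_demo, rs_demo))
```

      A VIF close to 1 indicates little shared variance between the two predictors, whereas large values suggest that individual beta coefficients may be unstable.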

      These issues do not undermine the study's main conclusions, which effectively demonstrate the feasibility and initial clinical relevance of applying SANDI to HD. Nonetheless, addressing them more thoroughly would enhance the clarity and interpretability of the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      Ioakeimidis and colleagues studied microstructural abnormalities in N=56 Huntington's disease (HD) patients compared to N=57 normative controls. The authors used a powerful MRI Connectom scanner and applied the SANDI model to estimate the soma size, neurite size, soma density, and extracellular fraction in key subcortical nuclei related to HD. In the striatum, they found decreased soma density and increased soma size, which also seemed to become more pronounced in advanced HD individuals in the final exploratory analyses. The authors conducted important analyses to find whether the SANDI measures correlate with clinical scores (i.e., QMotor) and whether the variance of the striatal volume is explained by the SANDI measures. They found a relationship between SANDI measures for both.

      Strengths:

      The study is both innovative and of high interest for the HD community. The authors provide a rich pool of statistical analyses and results that anticipate the questions that may emerge in the HD research community. Statistics are carefully chosen and image processing is done with state-of-the-art methods and tools. The sample size gives sufficient credibility to the findings. Altogether, I think this study sets a milestone in the attempts of the HD community to understand neuropathological processes with non-invasive methods, and extends the current knowledge of microstructural anomalies identified in HD with diffusion MRI. More importantly, the newly identified anomalies in soma size and soma density open new avenues for studying these biological effects further and perhaps developing these biomarkers for use in clinical trials.

      Weaknesses:

      (1) An important question is whether the SANDI measures, which require an expensive scanner and elaborate processing, are better biomarkers than the more traditional DTI measures. Can the authors compare the effect size of FA/MD with SANDI measures? In some of the plots and tables, FA/MD seem to have comparable, if not higher, correlations with QMotor or CAP scores. In the same vein, it is unclear whether DTI measures were included in hierarchical stepwise regression. I wonder if the stepwise models may have picked up FA/MD instead of SANDI measures if they are given a chance. Overall, I hope the authors can discuss their findings also in this light of cost vs. benefit of adopting SANDI in future studies, which is an important topic for clinical trials.

      (2) Similar to the above point, it is very important to consider how strong the biomarking signal is from SANDI measures compared to the good old striatal volume. Some plots seem to indicate that volumes still have the highest correlation with QMotor and the highest effect size in group comparisons. It would be helpful for the community to know where the new SANDI measures stand compared to the most typically used volumes in terms of effect size.

      (3) The diffusion measures are inevitably correlated to some degree. Please provide a correlation matrix in the supplementary material, including all DWI measures, to enable readers to better understand how similar SANDI measures are to each other or vs. other DTI measures. Perhaps adding volumes to this correlation matrix may also be a good future reference.

      (4) ISS stages:

      a) The online ISS calculator requires cut-offs derived from the longitudinal Freesurfer pipeline, while the authors do not have longitudinal data. Thus, the ISS classification might be inaccurate to some degree if the authors used the FS cross-sectional pipeline. Please review this issue and see if updated cut-offs should be used to classify participants.

      b) Were there really no participants with ISS 0 among the 56 HD individuals? Please clarify in the manuscript.

      (5) A note on terminology that might be confusing to some readers. According to the creators of ISS, the ISS stages are created for research only; they are not used or applied in the clinic. On the other hand, the terms "premanifest" and "manifest" have a clinical meaning, typically based on the diagnostic confidence level. The assignment of ISS0-1 to premanifest and ISS2-3 to manifest may create some non-trivial confusion, if not opposition, in some segments of the HD community. The authors can keep their current terminology, but will need to at least clarify to the reader that this assignment is speculative, does not fully match the clinically-based categories, and should not be confused with similarly named groups in the previous literature.

    4. Author response:

      Response to Reviewer 1:

      Ad (2) Clinical applications of SANDI have primarily focused on Multiple Sclerosis. However, since the preparation of the manuscript, one study has been published reporting reductions in apparent soma density and white and grey matter differences in apparent soma size in amyotrophic lateral sclerosis (ALS) (https://doi.org/10.1016/j.ejrad.2025.111981). We will include this paper in our revised manuscript.

      Responses to Reviewer 2:

      Strength:

      Ad (3) SANDI cannot directly differentiate between neuronal and glial cells, but the pattern of differences in the SANDI parameters we observed in Huntington’s disease (HD) is consistent with the known pathology in HD.

      Weaknesses:

      Ad (1) With regards to the question about scanner and acquisition consistency, we can confirm that all diffusion data of individuals with HD and healthy controls from the WAND study were acquired with the same multi-shell High Angular Resolution Diffusion Imaging (HARDI) protocol on the 3T Connectom scanner at CUBRIC. Thus, all diffusion data analysed and reported in this manuscript were acquired with the same protocol on the same strong gradient MRI system for harmonization and consistency purposes.

      We agree that for clinical adoption it is important to demonstrate that HD-related SANDI differences do not require ultra-strong gradient imaging and can be detected on standard clinical MRI systems. While we have not collected such data in people with HD, we and others have demonstrated the feasibility of modelling SANDI metrics from multi-shell diffusion-weighted imaging data acquired with a maximum b-value of 3,000 s/mm² on clinical 3T MRI systems in typical adults and people with MS or ALS (https://doi.org/10.1002/hbm.26416, https://doi.org/10.1038/s41598-024-60497-6, https://doi.org/10.1016/j.ejrad.2025.111981). These studies have demonstrated that it is feasible to characterise brain microstructural differences with SANDI on clinical scanners and that comparable patterns of results can be observed across different MRI systems. It should also be noted that there is presently a move towards stronger gradient implementation in clinical systems, as demonstrated by the release of the Siemens Cima.X system, which will allow higher b-value diffusion scanning on clinical systems.

      Ad (2) We agree that due to the small number of HD participants with HD-ISS staging the exploratory comparisons between ISS stages need to be interpreted with caution. We hope to gain access to some of the missing ISS information and plan to include these in the revised paper.

      Ad (3) With regards to the queries about the regression modelling choices:

      (1) As SANDI metrics differed between HC and HD groups, and hence may not be directly comparable, separate regression models for HC and HD data were conducted without formal comparisons between slopes. Only descriptive exploratory comparisons of the observed pattern were included.

      (2) We will provide cross-correlational analyses between all SANDI parameters in the supplements of the revised version of the paper to check for multicollinearity.

      (3) All model-based approaches, including SANDI, may be prone to model instability or parameter degeneracy and we will acknowledge and discuss this in the revised version.

      Responses to Reviewer 3:

      Weaknesses: 

      Ad (1) and (2) The effect sizes (ES) of group differences in SANDI, DTI, and volume measures in the caudate and putamen (Tables 3 and 4) were broadly comparable: apparent soma radius rs (rrb = 0.45-0.53), apparent soma density fis (rrb = 0.32-0.45), FA (rrb = 0.38-0.55), MD (rrb = 0.51-0.61), and volumes (rrb = 0.49-0.55). Similar ES were observed between fis and FA, and between rs and volumes. MD showed the largest ES, likely due to striatal atrophy-related CSF partial volume contamination.

      Cost-benefit analyses of imaging marker choices in clinical trials depend on the aim of the study. DTI provides sensitive but unspecific indices that are influenced by biological and geometrical tissue properties and capture a multitude of microstructural properties. Similarly, volumetric measurements do not inform about the underpinning neurodegenerative processes.

      With the advancement of disease-modifying therapies for HD it has become important to identify non-invasive imaging markers that can inform about the mechanistic effects of novel therapies. While DTI and volume metrics are sensitive to detect brain changes, they do not provide specific information about the underpinning tissue properties. Such information, however, may turn out to be important for the evaluation of mechanistic effects of novel therapeutics in clinical trials. Advanced microstructural models such as SANDI may help provide such information. We found that SANDI indices had statistically similar power to the gold standard measures of volumes, but with the added value of information about the underlying microstructure. We and others have also shown that SANDI can be applied to multi-shell diffusion data acquired in a clinically feasible time (~10 min) on standard 3T MRI systems (please refer to our response above).

      To summarise, DTI and volumes are sensitive to brain changes but will need to be complemented by more advanced microstructural measurements such as SANDI to gain a better understanding of the underlying tissue changes and effects of disease-modifying therapies.
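
      For readers unfamiliar with the effect sizes quoted in this response, the sketch below shows one common way to compute a rank-biserial correlation for two independent groups. It assumes that "rrb" denotes the rank-biserial correlation, which the response does not state explicitly; the function name is hypothetical and this is not the authors' analysis code.

```python
def rank_biserial(group_a, group_b):
    """Rank-biserial correlation for two independent samples.

    Assumption: this is the statistic abbreviated "rrb" above.
    Computed as the proportion of (a, b) pairs with a > b minus the
    proportion with a < b; tied pairs contribute zero. The result
    ranges from -1 (every a below every b) to +1 (every a above every b).
    """
    greater = sum(1 for a in group_a for b in group_b if a > b)
    less = sum(1 for a in group_a for b in group_b if a < b)
    return (greater - less) / (len(group_a) * len(group_b))
```

      Because the statistic depends only on the ordering of values between groups, it can be compared across measures with different units, such as diffusion metrics and volumes.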

      Ad (3) We will provide a correlation matrix of all DWI measures in the supplementary material to allow a better understanding of how similar SANDI measures are to each other and compared to DTI measures.

      Ad (4) Most of the people with HD who have taken part in our study were participants in the Enroll-HD study. We will use HD-ISS information from ENROLL as much as possible. As we do not have longitudinal imaging data for all individuals classified as ISS <2, we will compare our cross-sectional striatal volumes with those from age and sex matched individuals from WAND to determine whether people fall into ISS 0 or 1 category. This approach will hopefully allow us to increase the total HD-ISS sample size and to determine whether there were participants with ISS 0 in our sample.

      Ad (5) We will explain in the revised manuscript that ISS stages are created for research-only purposes and are not used or applied in the clinic, while “premanifest” and “manifest” are helpful concepts in the clinical context. We will clarify that we refer to individuals without motor symptoms as assessed with Total Motor Score (TMS) as premanifest and to those with motor symptoms as manifest. This roughly corresponds to individuals at ISS 0/1 without signs of motor symptoms compared to individuals at ISS 2-3 with signs of motor symptoms.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The small conductance calcium-activated potassium channel 2 (SK2) is an important drug target for treating neurological and cardiovascular diseases. However, structural information on this subtype of SK channels has been lacking, and it has been difficult to draw conclusions about activator and inhibitor binding and action in the absence of structural information.

      Here the authors set out to (1) determine the structure of the transmembrane regions of a mammalian SK2 channel, (2) determine the binding site of apamin, a historically important SK2 inhibitor whose mode of action is unclear, and (3) use the structural information to generate a novel set of activators/inhibitors that selectively target SK2.

      The authors largely achieved all the proposed goals, and they present their data clearly.

      Unable to solve the structure of the human SK2 due to excessive heterogeneity in its cytoplasmic regions, the authors create a chimeric construct using SK4, whose structure was previously solved, and use it for structural studies. The data reveal a unique extracellular structure formed by the S2-S3 loop, which appears to directly interact with the selectivity filter and modulate its conductivity. Structures of SK2 in the absence and presence of the activating Ca2+ ions both possess non-K+-selective/conductive selectivity filters, where only sites 3 and 4 are preserved. The S6 gates are captured in closed and open states, respectively. Apamin binds to the S2-S3 loop, and unexpectedly, induces a K+ selective/conductive conformation of the selectivity filter while closing the S6 gate.

      Through high-throughput screening of small compound libraries and compound optimization, the group identified a reasonably selective inhibitor and a related compound that acts as an activator. The characterization shows that these compounds bind in a novel binding site. Interestingly, the inhibitor, despite binding in a site different from that of apamin, also induces a K+ selective/conductive conformation of the selectivity filter while the activator induces a non-K+ selective/conductive conformation and an open S6 gate.

      The data suggest that the selectivity filter and the S6 gate are rarely open at the same time, and the authors hypothesize that this might be the underlying reason for the small conductance of SK2. The data will be valuable for understanding the mechanism of SK2 channel (and other SK subtypes).

      Overall, the data is of good quality and supports the claims made by the authors. However, a deeper analysis of the cryo-EM data sets might yield some important insights, i.e., about the relationship between the conformation of the selectivity filter and the opening of the S6 gate.

      We attempted focused 3D classification to identify subsets of particles with the S6 open and the SF in a conductive state but were not able to isolate such a particle class. This indicates that either none or a very small percentage of particles exists in a fully conductive state. This sentence was included in the results section: 

      “Focused 3D classification of the S3-S4 linker was unsuccessful in identifying particles subsets with a dilated extracellular constriction suggesting that either none or a very small percentage of Ca<sup>2+</sup>-bound SK2-4 is in a conductive state”

      Some insight and discussion about the allosteric networks between the SF and the S6 gate would also be a valuable addition.

      The extracellular constriction is in the same non-conductive conformation in the Ca<sup>2+</sup>-bound and Ca<sup>2+</sup>-free SK2-4 structures, suggesting that the conformations of the S3-S4 linker/SF and S6 are not allosterically coupled. We predict that Ca<sup>2+</sup> opens the intracellular gate and another physiological factor (not yet identified) promotes extracellular gate opening. These sentences were added to the results and discussion: “This along with the similar conformation of the S3-S4 linker in the Ca<sup>2+</sup>-bound and Ca<sup>2+</sup>-free states of SK2-4 suggests that Ca<sup>2+</sup>-dependent intracellular gate dynamics are not coupled to the conformation of the S3-S4 linker. Other yet to be identified physiological factors may be required to dilate the extracellular constriction.”

      “Alternatively, other physiological factors, such as PIP2[46,47] or protein-protein interactions[48-50], may exist in live cells that modulate the interaction between S3-S4 linker and the selectivity filter.”

      Reviewer #2 (Public review):

      Summary:

      The authors have used single-particle cryoEM imaging to determine how small-molecule regulators of the SK channel interact with it and modulate its function.

      Strengths:

      The reconstructions are of high quality, and the structural details are well described.

      Weaknesses:

      The electrophysiological data are poorly described. Several details of the structural observations require a mechanistic context, perhaps better relating them to what is known about SK channels or other K channel gating dynamics.

      As recommended, additional details for electrophysiological data were added to the results, methods, and figure legends for clarification.  

      The most pressing point I have to make, which could help improve the manuscript, relates to the selectivity filter (SF) conformation. Whether the two ion-bound state of SK2-4 (Figure 4A) represents a non-selective, conductive SF occluded by F243 or represents a C-type inactivated SF, further occluded by F243, is unclear. It would be important to discuss this. Reconstructions of Kv1.3 channels also feature a similar configuration, which has been correlated to its accelerated C-type inactivation.

      Structural overlays of Ca<sup>2+</sup>-bound SK2-4, HCN, and C-type inactivated Kv1.3 selectivity filters demonstrate that each has conformational differences, and it is difficult to definitively determine whether the SK2-4 selectivity filter is in a non-selective conformation like HCN or a C-type inactivated conformation like Kv1.3. Based on the number of ions observed in the filter and the position of Tyr361, we believe the selectivity filter most closely resembles that of HCN. Importantly, the selectivity filter conformation observed in the SK2-4 Ca<sup>2+</sup>-bound and Ca<sup>2+</sup>-free structures is ultimately nonconductive due to the Phe243 extracellular constriction blocking K<sup>+</sup> efflux.

      A comparison of the SK2-4 selectivity filter to HCN and C-type inactivated Kv1.3 was included in Figure 4 and this sentence was included in the results section:

      “The selectivity filter of SK2-4 resembles that of HCN in both the position of Tyr361 and the number of K<sup>+</sup> coordination sites (Fig 4E,F,G,H)”

      Furthermore, binding of a toxin derivative to Kv1.3 restores the SF into a conductive form, though occluded by the toxin. It appears that apamin binding to SK2-4 might be doing something similar. Although I am not sure whether SK channels undergo C-type inactivation like gating, classical MTS accessibility studies have suggested that dynamics of the SF might play a role in the gating of SK channels. It would be really useful (if not essential) to discuss the SF dynamics observed in the study and relate them better to aspects of gating reported in the literature.

      Extracellular toxin binding to SK2-4 and K<sub>v</sub>1.3 induces a conformational change in the selectivity filter to produce a canonical K<sup>+</sup> selective structure with four coordination sites. However, the mechanism by which the toxins produce the conformational change is different. For SK2-4, apamin interacts primarily with S3-S4 linker residues and induces a shift in the S3-S4 linker away from the pore axis. This in turn prevents the hydrogen bonds between Arg240 and Tyr245 of the S3-S4 linker and Asp363 at the C-terminus of the selectivity filter, producing a selectivity filter conformation with four K<sup>+</sup> coordination sites. For K<sub>v</sub>1.3, the sea anemone toxin ShK binds directly to the C-terminus of the selectivity filter, disrupting interactions required for the C-type inactivated structure and thereby inducing the conformational change. These sentences were added to the results:

      “Toxin-induced selectivity filter conformational change has also been reported for K<sub>v</sub>1.3 with the sea anemone toxin ShK. However, unlike apamin binding to SK2-4, ShK binds directly to the K<sub>v</sub>1.3 selectivity filter to convert a C-type inactivated conformation to a canonical K<sup>+</sup> selective structure with four coordination sites [39,40]. The change in selectivity filter conformation in apamin-bound SK2-4 seems to be driven instead by the weakening of interactions between the selectivity filter and the S3-S4 linker.”

      The SF of K channels, in conductive states, are usually stabilized by an H-bond network involving water molecules bridged to residues behind the SF (D363 in the down-flipped conformation and Y361). Considering the high quality of the reconstructions, I would suspect that the authors might observe speckles of density (possibly in their sharpened map) at these sites, which overlap with water molecules identified in high-resolution X-ray structures of KcsA, MthK, NaK, NaK2K, etc. It could be useful to inspect this region of the density map.

      We did not observe strong density near Y361 or D363 that could be confidently modeled as water. However, in the structures of SK2-4 bound to apamin and compound 1, Tyr361 in the selectivity filter rotates 180° and forms a hydrogen bond with Thr355 in the pore helix. The homologous hydrogen bond is also observed in SK4 and in the conductive/K<sup>+</sup> selective selectivity filter conformation of Kv1.3. The rotation of Tyr361 to form a hydrogen bond with Thr355, the reorientation of Asp363 and Trp350 into hydrogen bonding position, and the presence of four K<sup>+</sup> coordination sites upon binding of apamin and compound 1 strongly suggest that the selectivity filter is in a K<sup>+</sup> selective/conductive conformation. The Tyr361/Thr355 hydrogen bond is now described in the paper and shown in Figures 4D, 5D, and S6F.

      Reviewer #3 (Public review):

      This is a fundamentally important study presenting cryo-EM structures of a human small conductance calcium-activated potassium (SK2) channel in the absence and presence of calcium, or with interesting pharmacological probes bound, including the bee toxin apamin, a small molecule inhibitor, and a small molecule activator. As efforts to solve structures of the wild-type hSK2 channel were unsuccessful, the authors engineered a chimera containing the intracellular domain of the SK4 channel, the subtype of SK channel that was successfully solved in a previous study (reference 13). The authors present many new and exciting findings, including opening of an internal gate (similar to SK4), for the first time resolving the S3-S4 linker sitting atop the outer vestibule of the pore and unanticipated plasticity of the ion selectivity filter, and the binding sites for apamin, one new small molecule inhibitor and another small molecule activator. Appropriate functional data are provided to frame interpretations arising from the structures of the chimeric protein; the data are compelling, the interpretations are sound, and the writing is clear. This high-quality study will be of interest to membrane protein structural biologists, ion channel biophysicists, and chemical biologists, and will be valuable for future drug development targeting SK channels.

      The following are suggestions for strengthening an already very strong and solid manuscript:

      (1) It would be good to include some information in the text of the results section about the method and configuration used to obtain electrophysiological data and the limitations. It is not until later in the text that the Qube instrument is mentioned in the results section, and it is not until the methods section that the reader learns it was used to obtain all the electrophysiological data. Even there, it is not explicitly mentioned that a series of different internal solutions were used in each cell where the free calcium concentration was varied to obtain the data in Figure 1C. Also, please state the concentration of free calcium for the data in Figure 1B.

      As recommended, additional details for electrophysiological data were added to the results, methods, and figure legends for clarification.  

      (2) The authors do a nice job of discussing the conformations of the selectivity filter they observed here in SK as they relate to previous work on NaK and HCN, but from my perspective the authors are missing an opportunity to point out even more striking relationships with slow C-type inactivation of the selectivity filter in Shaker and Kv1 channels. C-type inactivation of the filter in Shaker was seen in 150 mM K using the W434F mutant (PMC8932672) or in 4 mM K for the WT channel (PMC8932672), and similar results have been reported for Kv1.2 (PMC9032944; PMC11825129) and for Kv1.3 (PMC9253088; PMC8812516) channels. For Kv1.3, C-type inactivation occurs even in 150 mM K (PMC9253088; PMC8812516). Not unlike what is seen here with apamin, binding of the sea anemone toxin (ShK) with a Fab attached (or the related dalazatide) inserts a Lys into the selectivity filter and stabilizes the conducting conformation of Kv1.3 even though the Lys depletes occupancy of S1 by potassium (PMC9253088; PMC8812516). Or might the conformation of the filter be controlled by regulatory processes in SK2 channels? I think connecting the dots here would enhance the impact of this study, even if it remains relatively speculative.

      Please see the response to reviewer 2’s comments for a comparison of the selectivity filter structure between SK2-4 and C-type inactivated K<sub>v</sub>1.3 and a discussion of toxin induced selectivity filter conformational change.

      What is known about how the functional properties of SK2 channels (where the filter changes conformation) differ from SK4, where the filter remains conducting (reference 13)? Is there any evidence that SK2 channels inactivate?

      Compared with SK4, SK2 has some unique properties such as lower conductance and the ability to switch between low- and high-open probability states. Mutation of Phe243 suggests that the S3-S4 linker conformation contributes to the low conductance. This is included in the discussion.

      “Such a mechanism may explain some properties of SK2 that are not observed in SK4, which lacks an S3-S4 linker, such as its low conductance (~10 pS) and the ability to switch between low- and high-open probability states[3,4]. Indeed, mutation of Phe243 in rat SK2 produced a 2-fold increase in channel conductance[5].”

      Or might the conformation of the filter be controlled by regulatory processes in SK2 channels? I think connecting the dots here would enhance the impact of this study, even if it remains relatively speculative.

      Please see the response to reviewer 1’s comments for a discussion of the potential physiological role of the S3-S4 linker/extracellular constriction and its mechanism for opening.

      Reviewer #1 (Recommendations for the authors):

      I enjoyed reading your paper and am intrigued by your findings on the selectivity filter of SK2. I've got a few recommendations for data analysis and a couple of questions that might contribute to the discussion.

      In your Ca2+-bound dataset, have you tried to parse out any alternative conformations (e.g., by using 3D classification, or 3D variability)? Do you think there might be a small(er) population of particles that adopt a fully open conformation? If you haven't done this already, I would recommend doing so. You have a rather large number of particles in your final 3D reconstruction (~660k), so there might be some hidden conformations that could contribute to our understanding of the system.

      I would recommend doing the same for your compound 4-bound data set.

      Please see above for response to this recommendation.

      Do you think apamin works solely as a pore blocker, or does its binding perhaps also affect the S6 gate via allosteric networks (perhaps the same ones that induce the formation of the K+ conductive SF through binding of compound 1 above the S6 gate)?

      Apamin binding does not change the conformation of the pore helices (S5 or S6) and thus we believe it acts primarily as a pore blocker. The following was added to the results section:

      “Overall, the apamin-bound SK2-4/CaM structure resembles Ca<sup>2+</sup>-bound SK2-4. The N-terminal lobe of CaM engages with the S<sub>45</sub>A helix, the S5 and S6 helices adopt a similar conformation, and the intracellular gate Val390 is open with a radius of 3.5 Å (Fig 2D). The most significant conformational change is in the position of the S3-S4 linker, which shifts ~2 Å away from the pore axis to accommodate apamin binding.”

      Is there a mechanistic explanation for why it might be difficult/energetically costly for the SF to be conductive and the S6 gate to be open at the same time?

      Not to our knowledge.

      I also have these minor recommendations:

      -In all figures showing density, include the threshold/sigma value at which density is shown.

      -For all ligands and ions, include half-map data.

      Sigma values were added to all figure legends displaying cryoEM density. The displayed maps are the sharpened full maps.

      Reviewer #2 (Recommendations for the authors):

      Is it possible to provide a structure-sequence guided explanation for the different affinity of compound 1 for SK2 vs SK4?

      Yes. The following is now included in the results section and a panel was added to Figure S6D.

      “However, for SK4 Thr212 replaces SK2 Ser318 and Trp216 (homologous to SK2 Trp322) is conserved but adopts a different rotamer conformation (Fig S6D). Both changes occlude the compound 1 binding site in SK4 and would likely reduce compound 1 potency on SK4 as observed in the functional data.”

      Is it possible to propose a model of modulation by compound 1/4 where the authors can comment on the conformational dependence of compound binding? That is, do they bind exclusively to the identified conformational states of the channel, or are they able to bind to both closed and open channels, but bias one state over the other?

      The clash between compound 1 and Thr386 in the open conformation of the S6 helices suggests that compound 1 would preferentially bind to the closed state of SK2. Similarly, the clash between compound 4 and Ile380 in the closed conformation of the S6 helices suggests that compound 4 would preferentially bind to the open state of SK2. This was included in the discussion:

      “This proposed mechanism of modulation suggests that compound 1 may bind preferentially to the closed conformation of the S6 helices and compound 4 may bind preferentially to the open conformation of the S6 helices.” 

      Please provide the calcium concentration used to generate the data in Figure 1B.

      The calcium concentration is now stated in the legend for Fig 1B:

      “Intracellular solution contains 2 µM Ca<sup>2+</sup> based on calculation using Maxchelator (see methods)”

      Essential and critically important descriptions of experiments in Figure 7A are lacking. It would be essential to describe properly, with care, what the currents and the conditions of measurements are. If these currents are obtained by subtracting leak currents by adding other drugs, it would be good to comment on whether the latter compete with compounds 1/4.

      As recommended, additional details for electrophysiological data were added to the results, methods, and figure legends for clarification. SK currents were obtained by subtracting the leak currents measured after adding UCL1684 at the end of each experiment. UCL1684 is not expected to interfere with the effect of compound 1 or 4, given their different binding sites and mechanisms.

      If Compound 1 changes the structure of the SF (Figure 6F), would it also promote apamin binding? Given that both these agents produce a similar change in the SF, could each favor the binding of the other?

      Since apamin binds to the S3-S4 linker, it is unlikely that the selectivity filter conformational change observed in the compound 1-bound structure would affect apamin binding.

    1. Reviewer #2 (Public review):

      This manuscript from Hariharan, Shi, Viner, and Guan presents x-ray crystallographic structures of membrane protein MelB and HDX-MS analysis of ligand-induced dynamics. This work improves on the resolution of previously published structures, introduces further sugar-bound structures, and utilises HDX to explore in further depth the previously observed positive cooperativity with the cotransported cation Na+. The work presented here builds on years of previous study and adds substantial new details into how Na+ binding facilitates melibiose binding and deepens the fundamental understanding of the molecular basis underlying the symport mechanism of cation-coupled transporters. However, the presentation of the data lacks clarity, and in particular, the HDX-MS data interpretation requires further explanation in both methodology and discussion.

      Comments on Crystallography and biochemical work:

      (1) It is not clear what Figure 2 is comparing. The text suggests this figure is a comparison of the lower resolution structure to the structure presented in this work; however, the figure legend does not mention which is which, and both images include a modelled water molecule that was not assigned due to poor resolution previously, as stated by the authors, in the previously generated structure. This figure should be more clearly explained.

      (2) It is slightly unclear what the ITC measurements add to this current manuscript. The authors comment that raffinose exhibiting poor binding affinity despite having more sugar units is surprising, but it is not surprising to me. No additional interactions can be mapped to these units on their structure, and while it fits into the substrate binding cavity, the extra bulk of additional sugar units is likely to reduce affinity. In fact, from their listed ITC measurements, this appears to be the trend. Additionally, the D59C mutant utilised here in structural determination is deficient in sodium/cation binding. The reported allostery of sodium-sugar binding will likely influence the sugar binding motif as represented by these structures. This is clearly represented by the authors' own ITC work. The ITC included in this work was carried out on the WT protein in the presence of Na+. The authors could benefit from clarifying how this work fits with the structural work or carrying out ITC with the D59C mutant, or additionally, in the absence of sodium.

      Comments on HDX-MS work:

      While the use of HDX-MS to deepen the understanding of ligand allostery is an elegant use of the technique, this reviewer advises the authors to refer to the Masson et al. (2019) recommendations for the HDX-MS article (https://doi.org/10.1038/s41592-019-0459-y) on how to best present this data. For example:

      (1) The Methodology includes a lipid removal step. Based on other included methods, I assumed that the HDX-MS was being carried out in detergent-solubilised protein samples. I therefore do not see the need for a lipid removal step that is usually included for bilayer reconstituted samples. I note that this methodology is the same as previously used for MelB. It should be clarified why this step was included, if it was in fact used; i.e., further details on the sample preparation should be included.

      (2) A summary of HDX conditions and results should be given as recommended, including the mean peptide length and average redundancy per state alongside other included information such as reaction temperature, sequence coverage, etc., as prepared for previous publications from the authors, i.e., Hariharan et al., 2024.

      (3) Uptake plots per peptide for the HDX-MS data should be included as supporting information outside of the few examples given in Figure 6.

      (4) A reference should be given to the hybrid significance testing method utilised. Additionally, as stated by Hageman and Weis (2019) (doi:10.1021/acs.analchem.9b01325), the use of P < 0.05 greatly increases the likelihood of false positive ΔD identifications. While the authors include multiple levels of significance, what they refer to as high and lower significant results, this reviewer understands that working with dynamic transporters can lead to increased data variation; a statement of why certain statistical criteria were chosen should be included, and possibly accompanied by volcano plots. The legend of Figure 6 should include what P value is meant by * and ** rather than statistically significant and highly statistically significant.

      (5) Line 316 states a significant difference is seen in dynamics; how is significance measured here? There is no S.D. given in Table S4. Can the authors further comment on the potential involvement in solvent accessibility and buried helices that might influence the overall dynamics outside of their role in sugar vs sodium binding? An expected low rate of exchange suggests that dynamics are likely influenced by solvent accessibility or peptide hydrophobicity? The increased dynamics at peptides covering the Na binding site on overall more dynamic helices suggests that there is no difference between the dynamics of each site.

      (6) Previously stated HDX-MS results of MelB (Hariharan et al., 2024) state that the transmembrane helices are less dynamic than polypeptide termini and loops with similar distributions across all transmembrane bundles. The previous data was obtained in the presence of sodium. Does this remove the difference in dynamics in the sugar-binding helices and the cation-binding helices? Including this comparison would support the statement that the sodium-bound MelB is more stable than the Apo state, along with the lack of deprotection observed in the differential analysis.

      (7) Have the authors considered carrying out an HDX-MS comparison between the WT and the D59C mutant? This may provide some further information on the WT structure (particularly a comparison with sugar-bound). This could be tied into a nice discussion of their structural data.

      (8) Have the authors considered utilising Li+ to infer how cation selectivity impacts the allostery? Do they expect similar stabilisation of a higher-affinity sugar binding state with all cations?

      (9) MD of MelB suggests all transmembrane helices are reorientated during substrate translocation, yet substrate and cotransporter ligand binding only significantly impacts a small number of helices. Can the authors comment on the ensemble of states expected from each HDX experiment? The data presented here instead shows overall stabilisation of the transporter. This data can be compared to that of HDX on MFS sugar cation symporter XylE, where substrate binding induces a transition to OF state. There is no discussion of how this HDX data compares to previous MFS sugar transporter HDX. The manuscript could benefit from this comparison rather than a comparison to LacY. It is unlikely that there are universal mechanisms that can be inferred even from these model proteins. Highlighting differences instead between these transport systems provides broader insights into this protein class. Doi: 10.1021/jacs.2c06148 and 10.1038/s41467-018-06704-1.

      (10) Additionally, the recent publication of SMFS data (by the authors: doi:10.1016/j.str.2022.11.011) states the following: "In the presence of either melibiose or a coupling Na+-cation, however, MelB increasingly populates the mechanically less stable state which shows a destabilized middle-loop C3." And "In the presence of both substrate and co-substrate, this mechanically less stable state of MelB is predominant.". It would benefit the authors to comment on these data in contrast to the HDX obtained here. Additionally, is the C3 loop covered, and does it show the destabilization suggested by these studies? HDX can provide a plethora of results that are missing from the current analysis on ligand allostery. The authors instead chose to reference CD and thermal denaturation methods as comparisons.

    2. Reviewer #3 (Public review):

      Summary:

      The melibiose permease from Salmonella enterica serovar Typhimurium (MelBSt) is a member of the Major Facilitator Superfamily (MFS). It catalyzes the symport of a galactopyranoside with Na⁺, H⁺, or Li⁺, and serves as a prototype model system for investigating cation-coupled transport mechanisms. In cation-coupled symporters, a coupling cation typically moves down its electrochemical gradient to drive the uphill transport of a primary substrate; however, the precise role and molecular contribution of the cation in substrate binding and translocation remain unclear. In a prior study, the authors showed that the binding affinity for melibiose is increased in the presence of Na+ by about 8-fold, but the molecular basis for the cooperative mechanism remains unclear. The objective of this study was to better understand the allosteric coupling between the Na+ and melibiose binding sites. To verify the sugar-recognition specific determinants, the authors solved the outward-facing crystal structures of a uniport mutant D59C with four sugar ligands containing different numbers of monosaccharide units (α-NPG, melibiose, raffinose, or α-MG). The structure with α-NPG bound has improved resolution (2.7 Å) compared to a previously published structure and to those with other sugars. These structures show that the specificity is clearly directed toward the galactosyl moiety. However, the increased affinity for α-NPG involves its hydrophobic phenyl group, which is positioned 4 Å from the phenyl group of Tyr26 and forms a strong stacking interaction. Moreover, a water molecule bound to OH-4 in the structure with α-NPG was proposed to contribute to the sugar recognition and appears on the pathway between the two specificity-determining pockets. Next, the authors analyzed by hydrogen-to-deuterium exchange coupled to mass spectrometry (HDX-MS) the changes in structural dynamics of the transporter induced by melibiose, Na+, or both. The data support the conclusion that the binding of the coupling cation at a remote location stabilizes the sugar-binding residues to switch to a higher-affinity state. Therefore, the coupling cation in this symporter was proposed to be an allosteric activator.

      Strengths:

      (1) The manuscript is generally well written.

      (2) This study builds on the authors' accumulated knowledge of the melibiose permease and integrates structural and HDX-MS analyses to better understand the communication between the sodium ion and sugar binding sites. A high sequence coverage was obtained for the HDX-MS data (86-87%), which is high for a membrane protein.

      Weaknesses:

      (1) I am not sure that the resolution of the structure (2.7 Å) is sufficiently high to unambiguously establish the presence of a water molecule bound to OH-4 of the α-NPG sugar. In Figure 2, the density for water 1 is not obvious to me, although it is indeed plausible that water mediates the interaction between OH4/OH6 and the residues Q372 and T373.

      (2) Site-directed mutagenesis could help strengthen the conclusions of the authors. Would the mutation(s) of Q372 and/or T373 support the water hypothesis by decreasing the affinity for sugars? Mutations of Thr 121, Arg 295, combined with functional and/or HDX-MS analyses, may also help support some of the claims of the authors regarding the allosteric communication between the two substrate-binding sites.

      (3) The main conclusion of the authors is that the binding of the coupling cation stabilizes those dynamic sidechains in the sugar-binding pocket, leading to a high-affinity state. This is visible when comparing panels c and a from Figure S5. However, there is both increased protection (blue, near the sugar) and decreased protection in other areas (red). The latter received less comment; could the increased flexibility in these red regions facilitate the transition between inward- and outward-facing conformations? The HDX changes induced by the different ligands were compared to the apo form (see Figure S5). It might be worthwhile for data presentation to also analyze the deuterium uptake difference by comparing the conditions sodium ion+melibiose vs melibiose alone. It would make the effect of Na+ on the structural dynamics of the melibiose-bound transporter more visible. Similarly, the deuterium uptake difference between sodium ion+melibiose vs sodium ion alone could be analyzed too, in order to plot the effect of melibiose on the Na+-bound transporter.

      (4) For non-specialists, it would be beneficial to better introduce and explain the choice of using D59C for the structural analyses.

      (5) In Figure 5a, deuterium changes are plotted as a function of peptide ID number. This is hardly informative without making it clearer which regions the peptides correspond to. Only one peptide is indicated (213-226); I would recommend indicating more of them in areas where deuterium changes are substantial.

      (6) From prior work of the authors, melibiose binding also substantially increases the affinity of the sodium ion. Can the authors interpret this observation based on the HDX data?

    3. Author response:

      Reviewer #1:

      While the structure of the melibiose permease in both outward and inward-facing forms has been solved previously, there remain unanswered questions regarding its mechanism. Hariharan et al set out to address this with further crystallographic studies complemented with ITC and hydrogen-deuterium exchange (HDX) mass spectrometry.

      They first report 4 different crystal structures of galactose derivatives to explore molecular recognition, showing that the galactose moiety itself is the main source of specificity. Interestingly, they observe a water-mediated hydrogen bonding interaction with the protein and suggest that this water molecule may be important in binding.

      We appreciate the understanding of our work presented in this manuscript by this reviewer.

      The results from the crystallography appear sensible, though the resolution of the data is low, with only the structure with NPG better than 3Å. However, it is a bit difficult to understand what novel information is being brought out here and what is known about the ligands. For instance, are these molecules transported by the protein or do they just bind? They measure the affinity by ITC, but draw very few conclusions about how the affinity correlates with the binding modes. Can the protein transport the trisaccharide raffinose?

      The four structures with a bound sugar of different sizes aimed to identify the binding motif on both the primary substrate (sugar) and the transporter (MelB<sub>St</sub>). Although the resolutions of the structures complexed with melibiose, raffinose, or α-MG are relatively low, the size and shape of the densities in each structure are consistent with the corresponding sugar molecules, which provide valuable data for determining the pose of the bound sugar. Additionally, there is another α-NPG-bound structure at a higher resolution of 2.7 Å. Therefore, our new data support the published binding site with the galactosyl moiety as the main interacting group. The identified water-1 in this study further confirms the orientation of C4-OH. Notably, this transporter does not recognize or transport glucosides where the orientation of C4-OH at the glucopyranosyl ring is opposite. We will provide stronger data to support the water-1.

      Regarding the raffinose question, we should have clearly introduced the historical background. Bacterial disaccharide transporters have broad specificity, allowing them to work on a group of sugars with shared structural elements; conversely, one sugar molecule can be transported by several transporters. As reported in the literature, the galactosides melibiose, lactose, and raffinose can be transported by both LacY and MelB of E. coli. We did not test whether MelB<sub>St</sub> can transport α-NPG and raffinose. To address this issue and strengthen our conclusions, we plan to conduct additional experiments to gather evidence of the translocation of these sugars by MelB<sub>St</sub>.

      The HDX also appears to be well done; however, in the manuscript as written, it is difficult to understand how this relates to the overall mechanism of the protein and the conformational changes that the protein undergoes.

      Previously, we used HDX-MS to examine the conformational transition between inward- and outward-facing conformations using a conformation-specific nanobody to trap MelB<sub>St</sub> in an inward-facing state, as structurally resolved by cryoEM single-particle analysis and published in eLife 2024. That study identified dynamic regions that may be involved in the conformational transitions; however, there was no sugar present. We also solved and published the crystal structure of the apo D59C MelB<sub>St</sub>. The sugar-bound and apo states are virtually identical. To address the positive cooperativity of binding between the sugar and co-transport cations observed in biophysical analysis, in this study, we utilize HDX-MS to analyze the structural dynamics induced by melibiose, Na<sup>+</sup>, or both, focusing on the binding residues at the sugar-binding and cation-binding pockets. The results suggest that the coupling cation stabilizes sugar-binding residues at helices I and V, contributing to affinity but not specificity.

      Since MelB<sub>St</sub> favors the outward-facing conformation, and simulations on the free-energy landscape suggest that the highest affinity of the sugar-bound state is also at an outward-facing state, MelB<sub>St</sub> in both the apo and bound states tends to remain in the outward-facing conformation. We will include a section comparing these differences. Thank you to this reviewer for the critical insight.

      Reviewer #2:

      This manuscript from Hariharan, Shi, Viner, and Guan presents x-ray crystallographic structures of membrane protein MelB and HDX-MS analysis of ligand-induced dynamics. This work improves on the resolution of previously published structures, introduces further sugar-bound structures, and utilises HDX to explore in further depth the previously observed positive cooperativity with the cotransported cation Na<sup>+</sup>. The work presented here builds on years of previous study and adds substantial new details into how Na<sup>+</sup> binding facilitates melibiose binding and deepens the fundamental understanding of the molecular basis underlying the symport mechanism of cation-coupled transporters. However, the presentation of the data lacks clarity, and in particular, the HDX-MS data interpretation requires further explanation in both methodology and discussion.

      We thank this reviewer for taking the time to read our previous articles related to this manuscript.

      Comments on Crystallography and biochemical work:

      (1) It is not clear what Figure 2 is comparing. The text suggests this figure is a comparison of the lower resolution structure to the structure presented in this work; however, the figure legend does not mention which is which, and both images include a modelled water molecule that was not assigned due to poor resolution previously, as stated by the authors, in the previously generated structure. This figure should be more clearly explained.

      This figure shows a stereo view of a density map created in cross-eye style to demonstrate its quality. We will update this figure with a higher-resolution map in which the density for Wat-1 is clearly visible. This also addresses Reviewer-3’s comment regarding the map resolution.

      (2) It is slightly unclear what the ITC measurements add to this current manuscript. The authors comment that raffinose exhibiting poor binding affinity despite having more sugar units is surprising, but it is not surprising to me. No additional interactions can be mapped to these units on their structure, and while it fits into the substrate binding cavity, the extra bulk of additional sugar units is likely to reduce affinity. In fact, from their listed ITC measurements, this appears to be the trend. Additionally, the D59C mutant utilised here in structural determination is deficient in sodium/cation binding. The reported allostery of sodium-sugar binding will likely influence the sugar binding motif as represented by these structures. This is clearly represented by the authors' own ITC work. The ITC included in this work was carried out on the WT protein in the presence of Na<sup>+</sup>. The authors could benefit from clarifying how this work fits with the structural work or carrying out ITC with the D59C mutant, or additionally, in the absence of sodium.

      While raffinose and α-MG have been reported as substrates of MelB in E. coli, binding data are unavailable; additionally, for MelB<sub>St</sub>, we lack data on the binding of two of the four sugars or sugar analogs. We performed a label-free binding assay using ITC to address this concern with the WT MelB<sub>St</sub>. We will also perform the binding assay with the D59C MelB<sub>St</sub>, since sugar binding has been structurally analyzed with this mutant, as pointed out by this reviewer. Along with other new functional results, we will prepare a new Figure 1 on functional analysis, which will also address the comment regarding extra bulk at the non-galactosyl moiety with poor affinity.

      This D59C uniport mutant exhibits increased thermostability, making it a valuable tool for crystal structure determination, especially since the wild type (WT) is difficult to crystallize at high quality. Asp59 is the only site that responds to the binding of all coupling cations: Na<sup>+</sup>, Li<sup>+</sup>, or H<sup>+</sup>. Notably, this mutant selectively abolishes cation binding and cotransport. However, it still maintains intact sugar binding with slightly higher affinity and preserves the conformational transition, as demonstrated by an electroneutral transport reaction, the melibiose exchange, and fermentation assays with intact cells. Therefore, the structural data derived from this mutant are significant and offer important mechanistic insights into sugar transport. We will provide additional details during the revision.

      Comments on HDX-MS work:

      While the use of HDX-MS to deepen the understanding of ligand allostery is an elegant use of the technique, this reviewer advises the authors to refer to the Masson et al. (2019) recommendations for the HDX-MS article (https://doi.org/10.1038/s41592-019-0459-y) on how to best present this data. For example:

      All authors appreciate this reviewer’s comments and suggestions, which will be incorporated into the revision.

      (1) The Methodology includes a lipid removal step. Based on other included methods, I assumed that the HDX-MS was being carried out in detergent-solubilised protein samples. I therefore do not see the need for a lipid removal step that is usually included for bilayer reconstituted samples. I note that this methodology is the same as previously used for MelB. It should be clarified why this step was included, if it was in fact used, aka, further details on the sample preparation should be included.

      Yes, a lipid/detergent removal step was applied in this study and in previous studies and this information was clearly described in Methods.

      (2) A summary of HDX conditions and results should be given as recommended, including the mean peptide length and average redundancy per state alongside other included information such as reaction temperature, sequence coverage, etc., as prepared for previous publications from the authors, i.e., Hariharan et al., 2024.

      We will update Table S2. Thank you.

      (3) Uptake plots per peptide for the HDX-MS data should be included as supporting information outside of the few examples given in Figure 6.

      We will prepare the plots in supplementary information.

      (4) A reference should be given to the hybrid significance testing method utilised. Additionally, as stated by Hageman and Weis (2019) (doi:10.1021/acs.analchem.9b01325), the use of P < 0.05 greatly increases the likelihood of false positive ΔD identifications. While the authors include multiple levels of significance, what they refer to as high and lower significant results, this reviewer understands that working with dynamic transporters can lead to increased data variation; a statement of why certain statistical criteria were chosen should be included, and possibly accompanied by volcano plots. The legend of Figure 6 should include what P value is meant by * and ** rather than statistically significant and highly statistically significant.

      We appreciate this comment and will cite this article on the hybrid significance method. We will include volcano plots for each dataset. We fully acknowledge that using a cutoff of P < 0.05 can increase the likelihood of false-positive identifications. However, given the complexity of the samples analyzed in this study, we believe that some important changes may have been excluded due to higher variability within the dataset. By applying multiple levels of statistical testing, we determined that P < 0.05 represents a suitable threshold for this study. The threshold values were marked in the residual plots and explained in the text. For Figure 6, we have revised it by showing the P value directly.

      (5) Line 316 states that a significant difference is seen in dynamics; how is significance measured here? There is no S.D. given in Table S4. Can the authors further comment on the potential involvement in solvent accessibility and buried helices that might influence the overall dynamics outside of their role in sugar vs sodium binding? An expected low rate of exchange suggests that dynamics are likely influenced by solvent accessibility or peptide hydrophobicity? The increased dynamics at peptides covering the Na binding site on overall more dynamic helices suggests that there is no difference between the dynamics of each site.

      Table S4 was created to provide an overall view of the dynamic regions. If we understand correctly, this reviewer asked us to comment on the effect of solvent accessibility or hydrophobic regions on the overall dynamics outside the binding residues of the peptides that carry binding residues. The HDX rate is influenced by two linked factors, solvent accessibility and hydrogen-bonding interactions that reflect structural dynamics, so poor solvent accessibility in buried regions results in low deuterium uptake. The peptides in our dataset that include the Na<sup>+</sup>-binding site showed low HDX, likely due to poor solvent accessibility and structural stability. It is unclear what this reviewer meant by "increased dynamics at peptides covering the Na binding site on overall more dynamic helices." We do not observe increased dynamics in peptides covering Na<sup>+</sup>-binding sites.

      (6) Previously stated HDX-MS results of MelB (Hariharan et al., 2024) state that the transmembrane helices are less dynamic than polypeptide termini and loops with similar distributions across all transmembrane bundles. The previous data was obtained in the presence of sodium. Does this remove the difference in dynamics in the sugar-binding helices and the cation-binding helices? Including this comparison would support the statement that the sodium-bound MelB is more stable than the Apo state, along with the lack of deprotection observed in the differential analysis.

      Thanks for this suggestion. The previous datasets were collected in the presence of Na<sup>+</sup>. In the current study, we also have a Na-containing dataset. Both showed similar results: the multiple overlapping peptides covering the sugar-binding residues on helices I and V have higher HDX rates than those covering the Na<sup>+</sup>-binding residues, even when Na<sup>+</sup> is present in both datasets.

      (7) Have the authors considered carrying out an HDX-MS comparison between the WT and the D59C mutant? This may provide some further information on the WT structure (particularly a comparison with sugar-bound). This could be tied into a nice discussion of their structural data.

      Thanks for this suggestion. Conducting the HDX-MS comparison between the WT and the D59C mutant is certainly interesting, especially given the growing amount of structural and biochemical/biophysical data available for this mutant. However, due to limited resources, we might consider doing it later.

      (8) Have the authors considered utilising Li<sup>+</sup> to infer how cation selectivity impacts the allostery? Do they expect similar stabilisation of a higher-affinity sugar binding state with all cations?

      Thanks for this suggestion. We have demonstrated that Li<sup>+</sup> also shows positive cooperativity with melibiose through ITC binding measurements. Li<sup>+</sup> binds to MelB<sub>St</sub> with higher affinity than Na<sup>+</sup> but causes many different effects on MelB. It is worth investigating this thoroughly and individually. To address the second question, H<sup>+</sup> is a poor coupling cation with minimal impact on melibiose binding. Since its pKa is around 6.5, only a small subpopulation of MelB<sub>St</sub> is protonated at pH 7.5. Sugar-binding cooperativity is highest with Na<sup>+</sup>, followed by Li<sup>+</sup> and then H<sup>+</sup>.
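
      The size of the protonated subpopulation mentioned above can be estimated with a rough calculation. The sketch below assumes a single titratable site that follows the Henderson-Hasselbalch relation with a pKa of 6.5; the function name `protonated_fraction` is illustrative and does not come from the manuscript.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a single titratable site in the protonated form.

    Derived from the Henderson-Hasselbalch relation:
    [A-]/[HA] = 10 ** (pH - pKa), so HA / (HA + A-) = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed values taken from the text: pKa ~ 6.5, experiments at pH 7.5.
fraction = protonated_fraction(ph=7.5, pka=6.5)
print(f"{fraction:.1%}")  # prints 9.1%
```

      Under this simplified single-site assumption, roughly one molecule in eleven would be protonated at pH 7.5, which is consistent with the description of a small subpopulation.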

      (9) MD of MelB suggests all transmembrane helices are reorientated during substrate translocation, yet substrate and cotransporter ligand binding only significantly impacts a small number of helices. Can the authors comment on the ensemble of states expected from each HDX experiment? The data presented here instead shows overall stabilisation of the transporter. This data can be compared to that of HDX on MFS sugar cation symporter XylE, where substrate binding induces a transition to OF state. There is no discussion of how this HDX data compares to previous MFS sugar transporter HDX. The manuscript could benefit from this comparison rather than a comparison to LacY. It is unlikely that there are universal mechanisms that can be inferred even from these model proteins. Highlighting differences instead between these transport systems provides broader insights into this protein class. Doi: 10.1021/jacs.2c06148 and 10.1038/s41467-018-06704-1.

      The sugar translocation free-energy landscape simulations showed that both helix bundles move relative to the membrane plane. That analysis aimed to clarify a hypothesis in the field—that the MFS transporter can use an asymmetric mode to transition between inward- and outward-facing states. In the case of MelB, we clearly demonstrated that both domains move and each helix bundle moves as a unit, so the labeling changes were identified only in some extramembrane loops and a few highly flexible helices. Thanks for the suggestion about comparing with XylE. We will include a discussion on it.

      (10) Additionally, the recent publication of SMFS data (by the authors: doi:10.1016/j.str.2022.11.011) states the following: "In the presence of either melibiose or a coupling Na<sup>+</sup>-cation, however, MelB increasingly populates the mechanically less stable state which shows a destabilized middle-loop C3." And "In the presence of both substrate and co-substrate, this mechanically less stable state of MelB is predominant.". It would benefit the authors to comment on these data in contrast to the HDX obtained here. Additionally, is the C3 loop covered, and does it show the destabilization suggested by these studies? HDX can provide a plethora of results that are missing from the current analysis on ligand allostery. The authors instead chose to reference CD and thermal denaturation methods as comparisons.

      We thank this reviewer for reading the single-molecule force spectroscopy (SMFS) study on MelB<sub>St</sub>. The C3 loop mentioned in this SMFS article is partially covered in the dataset Mel or Mel plus Na<sup>+</sup> vs. Apo, and more coverage is in the Na<sup>+</sup> vs. Apo. In either condition, no deprotection was detected. Two possible reasons the HDX data did not reflect the deprotection are: 1) the changes were too subtle and did not pass the statistical tests, and 2) the longest labeling time point was still insufficient to detect the changes; much longer labeling times should be considered in future studies.

      Reviewer #3:

      Summary:

      The melibiose permease from Salmonella enterica serovar Typhimurium (MelB<sub>St</sub>) is a member of the Major Facilitator Superfamily (MFS). It catalyzes the symport of a galactopyranoside with Na⁺, H⁺, or Li⁺, and serves as a prototype model system for investigating cation-coupled transport mechanisms. In cation-coupled symporters, a coupling cation typically moves down its electrochemical gradient to drive the uphill transport of a primary substrate; however, the precise role and molecular contribution of the cation in substrate binding and translocation remain unclear. In a prior study, the authors showed that the binding affinity for melibiose is increased in the presence of Na<sup>+</sup> by about 8-fold, but the molecular basis for the cooperative mechanism remains unclear. The objective of this study was to better understand the allosteric coupling between the Na<sup>+</sup> and melibiose binding sites. To verify the sugar-recognition specific determinants, the authors solved the outward-facing crystal structures of a uniport mutant D59C with four sugar ligands containing different numbers of monosaccharide units (α-NPG, melibiose, raffinose, or α-MG). The structure with α-NPG bound has improved resolution (2.7 Å) compared to a previously published structure and to those with other sugars. These structures show that the specificity is clearly directed toward the galactosyl moiety. However, the increased affinity for α-NPG involves its hydrophobic phenyl group, which is positioned at a distance of 4 Å from the phenyl group of Tyr26 and forms a strong stacking interaction. Moreover, a water molecule bound to OH-4 in the structure with α-NPG was proposed to contribute to the sugar recognition and appears on the pathway between the two specificity-determining pockets. Next, the authors analyzed by hydrogen-to-deuterium exchange coupled to mass spectrometry (HDX-MS) the changes in structural dynamics of the transporter induced by melibiose, Na<sup>+</sup>, or both. The data support the conclusion that the binding of the coupling cation at a remote location stabilizes the sugar-binding residues to switch to a higher-affinity state. Therefore, the coupling cation in this symporter was proposed to be an allosteric activator.

      Strengths:

      (1) The manuscript is generally well written.

      (2) This study builds on the authors' accumulated knowledge of the melibiose permease and integrates structural and HDX-MS analyses to better understand the communication between the sodium ion and sugar binding sites. A high sequence coverage was obtained for the HDX-MS data (86-87%), which is high for a membrane protein.

      We thank this reviewer for the positive comments.

      Weaknesses:

      (1) I am not sure that the resolution of the structure (2.7 Å) is sufficiently high to unambiguously establish the presence of a water molecule bound to OH-4 of the α-NPG sugar. In Figure 2, the density for water 1 is not obvious to me, although it is indeed plausible that water mediates the interaction between OH4/OH6 and the residues Q372 and T373.

      Thanks for your comments on the resolution. We will improve the density for Water 1.

      (2) Site-directed mutagenesis could help strengthen the conclusions of the authors. Would the mutation(s) of Q372 and/or T373 support the water hypothesis by decreasing the affinity for sugars? Mutations of Thr 121, Arg 295, combined with functional and/or HDX-MS analyses, may also help support some of the claims of the authors regarding the allosteric communication between the two substrate-binding sites.

      The authors thank this reviewer for the thoughtful suggestions. MelB<sub>St</sub> has been subjected to Cys-scanning mutagenesis (https://doi.org/10.1016/j.jbc.2021.101090). Placing a Cys residue on the hydrogen bond-donor Q372 significantly decreased the transport initial rate, accumulation, and melibiose fermentation, with little effect on protein expression, as shown in Figure 2 of this JBC paper. Although no binding data are available, the poor initial rate of transport with a similar amount of protein expressed suggested that the binding affinity is apparently decreased, supporting the role of water-1 in the binding pocket for better binding. The T373C mutant retained most activities of the WT. We will discuss the functional characterizations of these two mutants. Thanks.

      (3) The main conclusion of the authors is that the binding of the coupling cation stabilizes those dynamic sidechains in the sugar-binding pocket, leading to a high-affinity state. This is visible when comparing panels c and a from Figure S5. However, there is both increased protection (blue, near the sugar) and decreased protection in other areas (red). The latter was commented on less; could the increased flexibility in these red regions facilitate the transition between inward- and outward-facing conformations?

      Thanks for this important question. We will discuss the deprotected data in the context of the conformational transition between inward-facing and outward-facing states. The two regions, loop8-9 and loop1-2, are located in the gate area on both sides of the membrane and showed increased deuterium uptake upon binding of melibiose plus Na<sup>+</sup>. They are likely involved in this process.

      The HDX changes induced by the different ligands were compared to the apo form (see Figure S5). It might be worth it for data presentation to also analyze the deuterium uptake difference by comparing the conditions sodium ion+melibiose vs melibiose alone. It would make the effect of Na<sup>+</sup> on the structural dynamics of the melibiose-bound transporter more visible. Similarly, the deuterium uptake difference between sodium ion+melibiose vs sodium ion alone could be analyzed too, in order to plot the effect of melibiose on the Na<sup>+</sup>-bound transporter.

      We will analyze the data as suggested by this reviewer.

      (4) For non-specialists, it would be beneficial to better introduce and explain the choice of using D59C for the structural analyses.

      As stated in our response to Reviewer #2 on page 3, “Asp59 is the only site that responds to the binding of all coupling cations: Na<sup>+</sup>, Li<sup>+</sup>, or H<sup>+</sup>. Notably, this mutant selectively abolishes cation binding and cotransport. However, it still maintains intact sugar binding with slightly higher affinity and preserves the conformational transition, as demonstrated by an electroneutral transport reaction, the melibiose exchange, and fermentation assays with intact cells. Therefore, the structural data derived from this mutant are significant and offer important mechanistic insights into sugar transport. We will provide additional details during the revision.”

      (5) In Figure 5a, deuterium changes are plotted as a function of peptide ID number. It is hardly informative without making it clearer which regions it corresponds to. Only one peptide is indicated (213-226), I would recommend indicating more of them in areas where deuterium changes are substantial.

      We appreciate this comment, which will make the plots more meaningful. In the previous article published in eLife (2024), we drew boxes to mark the transmembrane regions; however, this generated much confusion, such as why some helices are very short. The revised figure will label the full length of covered positions.

      (6) From prior work of the authors, melibiose binding also substantially increases the affinity of the sodium ion. Can the authors interpret this observation based on the HDX data?

      This is an intriguing mechanistic question. Based on current data, we believe that the bound melibiose physically prevents the release of Na<sup>+</sup> or Li<sup>+</sup> from the cation-binding pocket. The cation-binding pocket and surrounding regions, including the sugar-binding residue Asp124, show low HDX, supporting this idea. Since we lack a structure with both substrates bound, figuring out the details structurally is challenging. However, we have a hypothesis about the intracellular Na<sup>+</sup> release as proposed in the 2024 JBC paper (https://doi.org/10.1016/j.jbc.2024.107427). After sugar release, the rotamer change of Asp55 will help Na<sup>+</sup> exit the cation pocket to the sugar pocket, and the negative membrane potential will facilitate the further movement from MelB to the cytosol. We will discuss this during the revision.

    1. Everyday ageism was associated with poor physical and mental health across all 4 outcomes examined (Table 2 and Figure 3). For each additional point on the Everyday Ageism Scale, odds of fair or poor physical health increased by 1.13-fold (95% CI, 1.01-1.17; P < .001) after adjusting for sociodemographic characteristics. The probability of fair or poor physical health was 0.082 for adults reporting everyday ageism 1 SD below the mean. This increased to 0.134 (63.4%) for those reporting mean levels of everyday ageism and 0.213 for those 1 SD above the mean (for an increase in probability of 59.0% vs the mean) (Table 2 and Figure 3A). Everyday ageism was associated with number of chronic health conditions

      I thought this paragraph was very interesting because it explains how ageism directly affects older adults' health. It stood out to me because it showed that as ageism increases, the probability of poor physical health and poor mental health increases as well. I think part of this has to do with stereotypes of older adults having little independence and people not allowing them to complete tasks on their own.
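
      The percentage increases quoted in the passage can be reproduced from the three reported probabilities. The sketch below is only an arithmetic check, assuming the percentages are relative changes between consecutive probability values; the helper name `relative_increase` is illustrative.

```python
def relative_increase(baseline: float, value: float) -> float:
    """Relative change from baseline to value, expressed in percent."""
    return (value - baseline) / baseline * 100.0


# Probabilities of fair or poor physical health as quoted in the passage.
p_below, p_mean, p_above = 0.082, 0.134, 0.213

print(round(relative_increase(p_below, p_mean), 1))  # 63.4 (1 SD below the mean -> mean)
print(round(relative_increase(p_mean, p_above), 1))  # 59.0 (mean -> 1 SD above the mean)
```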

    2. Everyday ageism differed by indicators of lower socioeconomic status (ie, education, income, and employment). Adults in rural areas (11.07 [95% CI, 10.45-11.68]) reported more everyday ageism than those in metropolitan areas (10.06 [95% CI, 9.40-10.72]; P = .003), as did those in the Midwest (10.57 [95% CI, 9.90-11.23]) compared with those in the Northeast (9.68 [95% CI, 9.18-10.18]; P = .006). Adults spending more than 4 hours daily on media reported more everyday ageism (11.03 [95% CI, 10.47-11.60]) than those with less media use (2-4 hours: 10.35 [95% CI, 9.83-10.87]; P = .004; <2 hours: 9.21 [95% CI, 8.81-9.61]; P < .001)

      I liked how this section was incorporated since there is not one specific reason for everyday ageism. I also like how it shows the difference in how the use of social media can change someone. This can help people caring for this population to understand factors that have gone into everyday ageism.

    1. USKORENIE, GLASNOST' AND PERESTROIKA: THE PATTERN O

      Group 1 has no summary posted

      Group 2: Gorbachev’s Non-Violent Revolution. Key points: Perestroika was about empowering the people and breaking away from totalitarian power. Gorbachev had a very humanistic approach that was against violence. His attitude towards women was one of true respect. There was a lack of a clear plan from Gorbachev in the beginning.

      Conclusions: How and why was he able to gain power under communism? Gorbachev’s goal was to reform the system by which the people had continuously been oppressed, and his reform was called Perestroika. His upbringing was full of hardships and famine, making him more understanding towards the people in his policies and shaping the way he took on his non-violent revolution and New Thinking.

      Group 4 has no summary posted

      Group 5: Linking Gorbachev’s Domestic and Foreign Policies. Key points: The Soviet Union has a neutral position regarding the conflict between ex-Union countries. Gorbachev’s policies are seen as “ambitious” by the author of the article. There was an anti-alcohol campaign that failed. Gorbachev’s international goals are focused on improving relations with countries like the US, UK and Israel, which was unusual in Russian policies.

      Conclusions: Gorbachev’s domestic reforms are linked to his foreign policy actions during this crucial time in Soviet history. The ongoing tension between Armenia and Azerbaijan, both of which used to be part of the Union, left the Soviet Union in a neutral position and shifted its attention towards foreign relations. This shift was driven by the realization that China was no longer a dependable ally, so there was a need to form new allies and gain stability.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03094

      Corresponding author(s): Saurabh S. Kulkarni

      1. General Statements

      We thank the reviewers for their strong praise of the manuscript, highlighting its rigor, depth, and conceptual importance. They consistently described the study as a beautiful, fascinating, and conceptually strong piece of work that addresses a timely question in multiciliated cells. They also noted the high quality of the data, careful quantification, and the use of multiple genetic and pharmacological approaches, all of which improve the reproducibility and credibility of the findings. Importantly, they emphasized the novelty of discovering a direct mechanistic link between Piezo1-mediated mechanotransduction and Foxj1-driven transcriptional control of multiciliation, representing a significant breakthrough for both the cilia field and mechanobiology more broadly. Collectively, these strengths highlight the manuscript’s wide impact and make it highly suitable for publication in a high-impact journal.

      2. Description of the planned revisions

      Reviewer #1:


      There are two experiments that would significantly strengthen these claims.

      • First, if their model is correct, then even short-term treatment with Yoda1 should induce the pathway and affect centriole numbers. While I appreciate the challenge of long-term Yoda1 treatment, it's not clear to me why it would be needed if short-term treatment is setting off the transcriptional cascade. Yoda1 is used throughout the paper to induce all the pathways, but we don't know if it actually induces the phenotype. I think this should be addressed with either short-term treatments or a dose response to find a dose that does not lead to skin peeling. It is hard to ignore this obvious deficiency.
      • Second, the model predicts that all of this is to regulate Foxj1 levels to regulate the subtle balance between cell size and centriole number. If this is correct, then the overexpression of Foxj1 should have a profound effect on centriole number in multiciliated cells. This is such an easy experiment that would validate many of the claims.

      RESPONSE:

      We recognize that the reviewer is asking us to test the sufficiency of the pathway with these comments: “If their model is correct, then they should be able to activate the pathway in one way or another to stimulate centriole number. This is a significant limitation to their overall model.” And “If this is correct, then the overexpression of Foxj1 should have a profound effect on centriole number in multiciliated cells.”

      To address reviewers’ suggestions, we will perform the following experiments.

      1. A brief exposure to Yoda1 (15 or 30 min), followed by a 3-hour wait before examining changes in centriole amplification. This will avoid the skin peeling caused by long-term exposure.
      2. A brief exposure to Yoda1 (15 min) followed by a 30-minute wait, with the cycle repeated 4 times over a total of 3 hours, before examining centriole amplification.
      3. The above two experiments will also be done in a constitutively active-Yap background to increase the probability that synergistic activation can lead to centriole amplification.
      4. Although Foxj1 is essential for multiciliogenesis, it is not sufficient to induce multiciliogenesis, as shown by multiple previous studies. Therefore, we do not expect overexpression of Foxj1 to have a profound effect on centriole number. While we will conduct the experiments because we truly want to address the suggestions and gain insight into the answers ourselves, we respectfully ask the Reviewer to consider the following responses to their concern.

      Yoda1 sufficiency: We agree that testing whether acute Yoda1 treatment can induce centriole amplification is an important question. We will conduct experiments with short-pulse and cyclic Yoda1 exposure, including in a constitutively active-YAP background (listed above), to address this possibility. However, several challenges complicate interpretation: (i) PIEZO1 adapts and desensitizes upon activation, (ii) transient signaling may be sufficient to cause secondary signaling but insufficient to drive stable transcriptional programs required for amplification, and (iii) centriole number is inherently variable, making modest effects difficult to resolve. However, we must recognize that failure to observe sufficiency under these conditions would not invalidate the model for two reasons: 1) absence of evidence is not evidence of absence, and thus, we may not have found the right experimental design. 2) PIEZO1–YAP is a necessary input but not sufficient on its own, as elaborated below. For both reasons, we are very careful about the interpretation of results in the manuscript, which shows that this pathway is necessary for centriole amplification using loss-of-function approaches.
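
      As a quick check on the cyclic design listed above (a 15-minute Yoda1 pulse followed by a 30-minute wait, repeated four times), the timing can be laid out explicitly. This is only an illustrative sketch: the function name `pulse_schedule` and the tuple layout are assumptions made for this example and are not part of the planned protocol.

```python
def pulse_schedule(exposure_min, wait_min, cycles):
    """Return (intervals, total_minutes) for repeated exposure/wait cycles.

    Each interval is a (start_min, end_min, label) tuple.
    """
    intervals = []
    t = 0
    for _ in range(cycles):
        # Exposure phase (agonist present).
        intervals.append((t, t + exposure_min, "Yoda1"))
        t += exposure_min
        # Wait phase (agonist washed out).
        intervals.append((t, t + wait_min, "wait"))
        t += wait_min
    return intervals, t


intervals, total = pulse_schedule(exposure_min=15, wait_min=30, cycles=4)
for start, end, label in intervals:
    print(f"{start:3d}-{end:3d} min: {label}")
print(f"total: {total} min ({total / 60:g} h)")
```

      Under these assumptions, the four cycles span 180 minutes, consistent with the 3-hour total stated in the plan.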

      Foxj1 overexpression: Foxj1 is a well-established regulator essential for motile ciliogenesis and multiciliogenesis across species (Xenopus, zebrafish, mouse). Loss of Foxj1 reduces cilia number in MCCs, but its activation alone does not have a profound effect on ciliogenesis/cilia number in MCCs. This is because Foxj1 is a part of a larger network essential for multiciliogenesis. This parallels the behavior of other transcriptional regulators, such as Myb, where loss of function impairs centriole amplification, but overexpression does not drive the formation of supernumerary centrioles. The studies establishing the roles of Foxj1 and Myb were seminal discoveries in the field of ciliogenesis, but they did not demonstrate the sufficiency of these molecules/pathways. Thus, our results, demonstrating that Foxj1 is necessary to induce tension-dependent centriole amplification, are significant, as the reviewer mentioned. The lack of Foxj1 sufficiency to induce centriole amplification is not a deficiency of the study, but rather evidence that Foxj1 is a part of a larger network essential for tension-dependent centriole amplification.

      Necessity versus sufficiency: We respectfully emphasize that sufficiency is not a prerequisite for demonstrating the significance of a pathway. Mechanochemical signaling is inherently complex, involving many mechanosensitive proteins and pathways. In our case, mechanical stretch increases centriole amplification, with PIEZO1–YAP signaling identified as a key mediator. However, we do not claim that PIEZO1–YAP alone is sufficient. Other pathways, including cadherin-mediated junctions, F-actin–myosin contractility, integrin–focal adhesion signaling, and nuclear mechanotransduction, likely contribute and may regulate unique downstream effectors that collectively promote centriole amplification. Therefore, PIEZO1–YAP should be regarded as one essential component within a larger network.


      TIMELINE: We will perform these additional proposed experiments. Since the first author, a postdoctoral researcher on this manuscript, has started a new job and will be coming in on weekends to complete the experiments, we estimate it will take approximately 2-3 months to finish them.


      Reviewer #2:

      1. Considering the Yap-piezo mechanism of action, the authors' logic for the selection of myb, foxj, plk4 and ccno as transcriptional targets is clear, but the HCR-derived signal and the differences seen in the yap morphants are not very strong, notwithstanding the statistical significance. There appear to be distinct subgroups within the treated populations (in Figure S6B, although these data seem quite different in Fig. 7H, so a comment on the technical differences might be helpful), so that the extent to which Yap1 regulates (Myb-)Foxj1 expression in MCCs is not clearly demonstrated by this experiment. Related to this point, it is unclear why 20-25% of the yap1/ piezo1 MO-treated embryos do not show a decline in FOXj1 in Fig. 6, given the qualitative nature of the scoring. Assuming the KD penetrance would vary on a cell-to-cell basis, rather than an embryo-to-embryo basis, this may suggest that there are additional relevant targets (some of which are discussed by the authors). Single-cell analysis might be a way to address this; however, this is not a trivial experiment, it might be sufficient to include a caveat in the text. Furthermore, the conclusion that Foxj1 regulates centriole amplification in a tension-dependent manner is well-supported by the data.

      RESPONSE: We appreciate the reviewer’s thoughtful observation. Differences in the expression of Foxj1 from experiment to experiment are possible due to a combination of factors, including heterogeneity in MCC development across embryos, slightly different embryonic stages, differences in embryo quality between fertilizations, and variability in morpholino delivery and knockdown penetrance, which can occur both across embryos and on a cell-to-cell basis within an embryo. We also note that technical aspects of HCR RNA-FISH, such as proteinase K treatment and washing steps, can affect signal intensity, potentially contributing to the appearance of distinct subgroups within treated populations.

      We agree that single-cell analysis would be a powerful way to dissect these differences, but as the reviewer notes, this is not a trivial experiment and is beyond the scope of the present study. We have therefore added clarifications in the text and discussion to acknowledge these sources of variability and to highlight the possibility of parallel pathways regulating foxj1 expression.

      ********************************************

      Controls for the knockdowns by the various MOs should be provided.

      RESPONSE: We appreciate the reviewer’s comment. The piezo1 MO has been previously established in Kulkarni et al. (2021). Additionally, the current manuscript includes MO control experiments for both erk2 and yap1, through KD at the 1-cell stage using the MO oligonucleotide, followed by mosaic-rescue with the respective WT RNA constructs (mCherry-ERK2 and yap1-GFP) and a nuclear tracer molecule such as H2B-RFP (Fig. 5, E-H, Fig. S5, C&D, Fig. 3, D-F). The mosaic-rescue is a robust experiment that provides an internal control within the same embryo, thereby avoiding differences that may arise due to embryo-to-embryo variability, embryo quality, or differences in fertilization batches. This approach also serves as a valuable tool for detecting cell-autonomous effects, providing a clear readout against uninjected neighboring cells, as the injected cells are labeled with a tracer. We will perform a similar mosaic-rescue experiment for the foxj1 MO.

      TIMELINE: We will conduct mosaic-rescue experiments for the foxj1 MO. We will need 1 month to complete the experiment.

      ********************************************

      Minor comments:

      Autocorrection of ERK1/2 or MEK1/2 pathways to 1/2 should be avoided. – We are unclear on this comment. Can the reviewer please clarify what they mean?


      Reviewer #3:

      Major concerns

      1- The presented data do not yet establish a specific, direct pathway linking mechanotransduction to centriole number, because the molecular players tested (PIEZO1, Ca²⁺, PKC, ERK, YAP, Foxj1) are highly pleiotropic. As such, the observed centriole number phenotypes, and some of the major conclusions, could be indirect. It is therefore critical to test the specificity and causality of the proposed pathway. This could be done with the authors' own strategies and/or with the following potential approaches:

      • Genetic dependency and sufficiency tests: It could be shown that Yoda1 has no effect in PIEZO1 loss-of-function MCCs, and that wild-type PIEZO1, but not conductance-dead PIEZO1 pore mutants restores Yoda1 responsiveness across centriole number, pERK, and YAP readouts. For example, PIEZO1 C terminus was shown to govern Ca²⁺ influx and ERK1/2 activation. Comparing full length PIEZO1 with a C terminal deletion in MCC restricted rescue; loss of rescue of centriole amplification and ERK/YAP activation with the C terminal deletion can provide a genetics anchored specificity test beyond broad inhibitors.

      RESPONSE:

      • To address the reviewer’s concern, we will test whether Yoda1 affects ERK and Yap activation when Piezo1 is depleted. We appreciate the reviewer’s thoughtful suggestion to employ genetic rescue experiments with Piezo1 mutants. Unfortunately, these are not technically feasible in Xenopus, as the Piezo1 coding sequence is exceptionally large (~7.5 kb), and repeated attempts by our group to generate and express stable, translatable transcripts have been unsuccessful. To address genetic dependency and specificity despite these technical barriers, we have employed a combination of orthogonal strategies that together provide strong genetic and mechanistic evidence:

      • Mosaic loss-of-function experiments (Fig. 1) demonstrate that Piezo1 regulates centriole number in a cell-autonomous manner, ruling out global epithelial or indirect tissue-wide effects.

      • Pharmacological activation/inhibition with a Piezo1-specific agonist (Yoda1) and inhibitors (GsMTx4, gadolinium) produced consistent phenotypes, including activation of downstream ERK and YAP readouts. Notably, Yoda1 is a Piezo-specific agonist, not a broad pharmacological agent.
      • Downstream pathway dissection (calcium chelation, PKC inhibition, ERK2 depletion, and YAP1 knockdown/rescue) consistently converges on the same phenotypes, reduced centriole amplification and altered Foxj1 expression, providing multiple independent lines of evidence that the Piezo1–Ca²⁺–PKC–ERK–YAP axis specifically controls centriole number.
      • Positive feedback regulation of Piezo1 expression by YAP/Foxj1 (Fig. 7) further strengthens the argument for a pathway-specific role rather than pleiotropic, indirect effects.

      Taken together, while full-length Piezo1 rescue experiments are technically not possible in Xenopus due to gene size constraints, our data employ state-of-the-art genetic, pharmacological, and orthogonal functional assays to rigorously test pathway specificity. These complementary approaches provide compelling evidence for the causal role of Piezo1-mediated mechanotransduction in centriole number control in MCCs.

      • Downstream bypass/rescue experiments: In PIEZO1 loss-of-function or BAPTA conditions, can enforcing MEK/ERK activation or YAP rescue the centriole number defect? Conversely, can MEK inhibitors block Yoda1-induced effects?

      RESPONSE: We appreciate the reviewer’s insightful questions.

      • We will express CA Yap in the Piezo1 KD background to assess if we can rescue centriole number. We also note that the converse experiment has already been performed in our study: 1) PKC inhibition abolishes Yoda1-induced ERK phosphorylation and nuclear localization (Fig. 2), 2) both MEK inhibition and ERK2 depletion block Yoda1-induced Yap activation and nuclear entry (Figs. 4, S2). Thus, we have directly demonstrated that MEK inhibition prevents Yoda1-induced effects, satisfying this aspect of the reviewer’s concern.

      ********************************************

      2- Image quantification and analysis must be described in greater detail in the Methods section, as they are central to the major conclusions of the manuscript. For example, the authors should explain how nuclear, cytoplasmic, and centriole segmentation were performed, and how relative protein levels in the nucleus versus the cytoplasm (e.g., YAP, volume- or area-based) were quantified. Specifically, the thresholds and segmentation criteria applied to different cellular structures under various conditions, as well as the use of Imaris and other software, should be clearly detailed.

      RESPONSE: We will describe the methods in greater detail.

      ********************************************

      3- PIEZO1 mRNA was shown to increase in a Foxj1-linked feedback loop. Does this increase translate into an increase in total protein levels?

      RESPONSE: If the reviewer is referring to Figure 7B, that panel shows Piezo1 antibody staining, so yes, the Piezo1 protein levels have increased.

      If the reviewer is referring to Figure 7C and D, we show that loss of Foxj1 leads to a reduction in Piezo1 mRNA expression.

      ********************************************

      4- Is the proposed signaling cascade active in mammalian multiciliated cells (e.g., airway epithelium)? If possible, testing this by using one of the major players of the pathway as a readout, such as ERK phosphorylation or YAP nuclear localization in mammalian MCCs, will reveal whether regulation of centriole number through this pathway is conserved and would strengthen the generality.


      RESPONSE: We agree with the reviewer that testing conservation of this pathway in mammalian MCCs is of great interest. Indeed, another group is currently investigating the role of Yap in the mammalian airway epithelium; in their temporally controlled Yap knockout model (the global Yap KO being embryonic lethal), they observed that Yap loss led to a reduction in centriole number. To avoid overlap and direct competition with this ongoing work, we chose to focus our efforts on Xenopus.

      Importantly, Xenopus has become a widely recognized and powerful system for MCC biology, enabling mechanistic dissection of centriole amplification and ciliogenesis. Several key discoveries in the field, including the identification of MCIDAS as a master regulator of MCC fate, were first made in Xenopus before being validated in mammals. Similarly, our study provides a mechanistic framework in Xenopus that can inform and guide ongoing studies in the mammalian airway.

      ********************************************

      5- Throughout the results section, there are multiple times where authors raised specific hypotheses about their data (e.g. foxj1 regulation of number control, apical actin/YAP). However, they have not tested them. These hypotheses are very exciting and, if possible, testing them experimentally would strengthen the conclusions associated with them.

      RESPONSE: We are not sure what the reviewer means here by “authors raised specific hypotheses about their data (e.g., foxj1 regulation of number control, apical actin/YAP). However, they have not tested them”, because:

      • Foxj1 regulation of centriole number: We demonstrate a clear reduction in centriole number upon Foxj1 depletion, and importantly, we extend this finding by showing that the reduction is tension-dependent (Fig. 6). We will perform a rescue assay to demonstrate the specificity.
      • Foxj1 and YAP: We never claimed that Foxj1 regulates YAP expression, and this is not part of our proposed model. Instead, our data show that Piezo1–ERK–YAP signaling regulates Foxj1.
      • Foxj1 and apical actin: Foxj1 regulation of apical F-actin has already been established in prior work, and in our study, we clearly observe reduced apical actin intensity in Foxj1-depleted MCCs (Fig. 6). To further strengthen this conclusion, we will provide a quantitative analysis of apical actin intensity in Foxj1 morphants.

      ********************************************

      TIMELINE: We will perform these additional proposed experiments. Since the first author, a postdoc on this manuscript, has started a new job and will be coming in on weekends to finish the experiments, we estimate it will take approximately 2-3 months to complete them.

      Minor comments

      MCC vs non MCC identification (Fig. 1): Clarify how non MCCs were distinguished from MCCs (e.g. markers/criteria). – Can the reviewer please clarify which panel or panels? Or provide more specific text that needs to be changed.

      Add the Kintner group reference linking motile cilia number and centriole number in Xenopus MCCs.– Can the reviewer clarify where and which reference? Thank you.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Reviewer 2

      Major comments:

      1. It should be clarified whether the immunoblots and the related quantitations in Figs. 2 and S2 are all from separate blots/ exposures. If so, they are not useful as controls, and these blots should be repeated with the relevant samples analyzed in parallel. Size markers and labels should be included (2B, 2G; S2B and S2G). An increase in total ERK would alter the interpretation of the increase in nuclear pERK in the IF experiments.

      RESPONSE: We thank the reviewer for raising this important point regarding clarification of the immunoblots. All experimental groups were analyzed in parallel with their corresponding controls. Because the primary antibodies for pERK and ERK were both raised in rabbit, we optimized our workflow to prevent protein loss during stripping and to ensure accurate visualization. Specifically, lysates from each experimental group were loaded in duplicate on the same gel, separated by a molecular weight ladder that served as a reference point. After transfer, the blot was cut along the ladder, and the two halves were processed in parallel: one probed with anti-pERK and the other with anti-ERK. This strategy ensured that all samples from a single experiment (e.g., Control and Yoda1-treated groups) were analyzed under identical conditions, with staining and imaging performed together at the same exposure. To enhance clarity, we have provided these data as uncut, full-length blots in Supplemental Figure 7 (Figure S7) of the revised manuscript.

      ********************************************

      Minor comments:

      1. Reference list should be checked for completeness; some citations lack journal/ volume/ page/ year details. – We have corrected the references.
      2. An 'overexposed' version of the image selected for centrioles in Figure 5F might be included with the Chibby-BFP at the same level as in the other figures. At present, the Yap KD cell in the image appears to have normal centrioles; this is potentially confusing, even though the authors clearly explain the matter in the text. – We have added a new panel to Fig. 5F to avoid confusion.
      3. It might be clearer to present injected/ uninjected in the same orientation in Fig. 6A and B. – Unfortunately, that is not possible because the injected and uninjected sides are left and right, and they cannot be in the same orientation.
      4. Figure 7B lacks the schematic described in the figure legend. – We have removed the Schematic sentence from the figure legend. That was an error on our side. Thank you for catching it.


      Reviewer 3


      1. Abstract: "how MCCs regulate centriole/cilia numbers remains a major knowledge gap" overstates the field; please soften to reflect recent advances (mechanics/apical area scaling; PIEZO1 implication). – We changed the text to “incompletely understood”.
      2. GsMTx4 rationale: State that GsMTx4 is a spider venom peptide that inhibits cationic mechanosensitive channels (including PIEZO1) and justify its use alongside Yoda1. – GsMTx4 was used in the previous manuscript, and its use was justified there. Here, we are only comparing the results. However, we have added a sentence describing what GsMTx4 is. We have also included a sentence explaining the use of Yoda1. “GsMTx4, a spider venom peptide used in our previous study, inhibits cationic mechanosensitive channels, including Piezo1.”

      “For this experiment, we used the Piezo1 channel-specific chemical agonist, Yoda1, to increase the sensitivity of Piezo1 and upregulate calcium entry into cells”

      Timeline statement: "Centriole amplification to migration and apical docking takes ~4-5 h (personal observation)" is not appropriate; either cite time lapse literature or include your own time lapse data. – We have added a reference that reports 2 hours of imaging; because this was not long enough to capture the entire process from intercalation to maturation, we have also retained “personal observation” in the manuscript. We are unaware of any study that has performed time-lapse imaging for 4 hours to capture the entire process of centriole amplification.

      Redundancy: The description of Yoda1 as a channel specific agonist is repeated; keep only once.- Removed

      "WT yap1 GFP construct previously used by Dr. Lance Davidson ..." should move construct description to Methods and keep only the citation in Results.– We moved it to Methods.

      "(Unpublished data; Dr. Mahjoub)" should be removed unless data are shown.- Removed

      Replace "as shown previously in our eLife paper" with "as we previously showed or shown previously (Kulkarni et al., 2021)".– We have made the change.

      The two hypotheses for how Foxj1 could regulate number under tension (actin remodeling vs. transcriptional control of amplification genes) belong in the Discussion unless tested. Moreover, the part on the discussion on yap sequestration by apical actin and the two possibilities presented also should go to the Discussion. – We have moved both to the discussion section.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      Reviewer 3

      1- The hypothesis about the centriole pool of Piezo as the mechanosensor for centriole number regulation is very exciting and novel. Can localization-controlled variants be used to test whether a centriole-associated pool directly senses tension for number control (for example, centrosome-targeted PIEZO1 via a PACT tag)? Alternatively, broad cellular Ca sensors (GCaMP) or centrosome-proximal Ca sensors (e.g., PACT-GCaMP) can be used to detect local calcium microdomains during tethering or Yoda1 treatment.

      RESPONSE: We appreciate the reviewer's curiosity and excitement; however, these experiments will not alter the conclusion of this paper and will be part of the next study, which aims to delve deeper into how different pools of Piezo1 at centrioles versus cell junctions function in MCCs. To that point, we had thought about these experiments. As mentioned earlier, the Piezo1 coding sequence is exceptionally large (~7.5 kb), and repeated attempts by our group to generate and express stable, translatable transcripts have been unsuccessful. Thus, the idea of centrosome-targeted PIEZO1 via a PACT tag is very exciting; however, it is not technically feasible. Beyond size, PIEZO1 is a trimeric, large plasma-membrane mechanosensitive channel that requires proper ER processing and bilayer incorporation. PACT localizes cargo to the centriole/pericentriolar material, not a membrane compartment; thus, a PACT-anchored PIEZO1 would be membrane-mismatched and almost certainly nonfunctional even if expressed.

      Second, centrosome-proximal GCaMP (PACT-GCaMP) would show correlation, not causation. This experiment does not address the question of the “centriole pool of Piezo as the mechanosensor for centriole number regulation”. It would only show whether the Ca2+ influx is happening at the basal bodies, but not whether and how that Ca2+ is essential for centriole amplification. For this purpose, we would need to find a way to block Ca2+ influx specifically at basal bodies, rather than junctions, which would require extensive controls.

      We do not claim that any specific Piezo1 or Ca2+ pool is critical for controlling centriole number and thus the suggested experiment would not alter the manuscript's conclusions. We therefore view the above as exciting future directions rather than prerequisites.

      ********************************************

      2- Because the proposed pathway is tension-sensing and the YAP pathway is tightly linked to the actin cytoskeleton, the role of the actin cytoskeleton in the proposed pathway should be tested directly. The authors mention different hypotheses around actin but have not tested them in the manuscript. For example, actin-dependent sequestration of Yap at the apical surface is intriguing. Does actin polymerization induced by drugs release Yap from the apical surface?

      RESPONSE: We would like to thank the reviewer for their suggestion. As per the reviewer's suggestion, we have moved this section to the Discussion, stating that “In the future, we plan to address this question by examining how Yap is sequestered by apical actin.”

      However, we appreciate the reviewer’s enthusiasm and would like to share some experiments we are planning to test the hypothesis.

      We plan to examine whether actin polymerization or contractility is responsible for Yap sequestration at, or release from, the apical surface with the following experiments: 1) testing whether Yap is displaced by treatment with Jasplakinolide, which stabilizes filamentous actin; 2) using a ROCK inhibitor to decrease contractility in the absence or presence of Yoda1; and 3) using genetic constructs such as Shroom3 to increase ROCK-mediated contractility and observe changes in Yap localization and dynamics.

      Although these experiments are interesting, they do not alter the conclusion of the current manuscript, and they represent future directions for our research.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This manuscript investigates how mechanical tension is transduced into centriole amplification in Xenopus multiciliated cells (MCCs). Building on prior work showing that centriole number scales with MCC apical area and that this scaling depends on PIEZO1, the study proposes that MCCs repurpose a canonical mechanochemical axis (PIEZO1 → Ca²⁺/PKC → ERK1/2 → YAP → Foxj1) to regulate centriole number rather than mitosis. The authors use tethered vs. untethered animal cap explants to modulate tissue tension, combine pharmacologic perturbations with genetic loss of function and rescue and quantitative image analysis, and present a model in which tension-gated PIEZO1 activates ERK/YAP, influences Foxj1, and tunes centriole number in MCCs.

      The manuscript tackles an important and timely problem with clear disease relevance. Its major advance is the presented model, which posits that post-mitotic MCCs repurpose a canonical mechanotransduction module to regulate organelle number rather than proliferation. It is a conceptually strong study addressing an important problem with a clean mechanical paradigm. However, to support the central claim that centriole number control is a specific, direct consequence of the PIEZO1-Ca²⁺-ERK/YAP pathway within MCCs, the revision should establish specificity and causality and provide experimental data for some of the major conclusions as detailed below. Addressing these points is critical to support the mechanistic conclusions and impact.

      Major concerns:

      1) The presented data do not yet establish a specific, direct pathway linking mechanotransduction to centriole number, because the molecular players tested (PIEZO1, Ca²⁺, PKC, ERK, YAP, Foxj1) are highly pleiotropic. As such, the observed centriole number phenotypes, and some of the major conclusions, could be indirect. It is therefore critical to test the specificity and causality of the proposed pathway. This could be done with the authors' own strategies and/or with the following potential approaches:

      • Genetic dependency and sufficiency tests: It could be shown that Yoda1 has no effect in PIEZO1 loss-of-function MCCs, and that wild-type PIEZO1, but not conductance-dead PIEZO1 pore mutants restores Yoda1 responsiveness across centriole number, pERK, and YAP readouts. For example, PIEZO1 C terminus was shown to govern Ca²⁺ influx and ERK1/2 activation. Comparing full length PIEZO1 with a C terminal deletion in MCC restricted rescue; loss of rescue of centriole amplification and ERK/YAP activation with the C terminal deletion can provide a genetics anchored specificity test beyond broad inhibitors.

      • Downstream bypass/rescue experiments: In PIEZO1 loss-of-function or BAPTA conditions, can enforcing MEK/ERK activation or YAP rescue the centriole number defect? Conversely, can MEK inhibitors block Yoda1-induced effects?

      2) The hypothesis about the centriole pool of Piezo as the mechanosensor for centriole number regulation is very exciting and novel. Can localization-controlled variants be used to test whether a centriole-associated pool directly senses tension for number control (for example, centrosome-targeted PIEZO1 via a PACT tag)? Alternatively, broad cellular Ca sensors (GCaMP) or centrosome-proximal Ca sensors (e.g., PACT-GCaMP) can be used to detect local calcium microdomains during tethering or Yoda1 treatment.

      3) Because the proposed pathway is tension-sensing and the YAP pathway is tightly linked to the actin cytoskeleton, the role of the actin cytoskeleton in the proposed pathway should be tested directly. The authors mention different hypotheses around actin but have not tested them in the manuscript. For example, actin-dependent sequestration of Yap at the apical surface is intriguing. Does actin polymerization induced by drugs release Yap from the apical surface?

      4) Image quantification and analysis must be described in greater detail in the Methods section, as they are central to the major conclusions of the manuscript. For example, the authors should explain how nuclear, cytoplasmic, and centriole segmentation were performed, and how relative protein levels in the nucleus versus the cytoplasm (e.g., YAP, volume- or area-based) were quantified. Specifically, the thresholds and segmentation criteria applied to different cellular structures under various conditions, as well as the use of Imaris and other software, should be clearly detailed.

      5) PIEZO1 mRNA was shown to increase in a Foxj1-linked feedback loop. Does this increase translate into an increase in total protein levels?

      6) Is the proposed signaling cascade active in mammalian multiciliated cells (e.g., airway epithelium)? If possible, testing this by using one of the major players of the pathway as a readout, such as ERK phosphorylation or YAP nuclear localization in mammalian MCCs, will reveal whether regulation of centriole number through this pathway is conserved and would strengthen the generality.

      7) Throughout the results section, the authors repeatedly raise specific hypotheses about their data (e.g. foxj1 regulation of number control, apical actin/YAP) without testing them. These hypotheses are very exciting, and testing them experimentally, if possible, would strengthen the associated conclusions.

      Minor concerns:

      1) Abstract: "how MCCs regulate centriole/cilia numbers remains a major knowledge gap" overstates the field; please soften to reflect recent advances (mechanics/apical area scaling; PIEZO1 implication).

      2) MCC vs non MCC identification (Fig. 1): Clarify how non MCCs were distinguished from MCCs (e.g. markers/criteria).

      3) GsMTx4 rationale: State that GsMTx4 is a spider venom peptide that inhibits cationic mechanosensitive channels (including PIEZO1) and justify its use alongside Yoda1.

      4) Timeline statement: "Centriole amplification to migration and apical docking takes ~4-5 h (personal observation)" is not appropriate; either cite time lapse literature or include your own time lapse data.

      5) Redundancy: The description of Yoda1 as a channel specific agonist is repeated; keep only once.

      6) "WT yap1 GFP construct previously used by Dr. Lance Davidson ..." should move construct description to Methods and keep only the citation in Results.

      7) "(Unpublished data; Dr. Mahjoub)" should be removed unless data are shown.

      8) Add the Kintner group reference linking motile cilia number and centriole number in Xenopus MCCs.

      9) Replace "as shown previously in our eLife paper" with "as we previously showed or shown previously (Kulkarni et al., 2021)".

      10) The two hypotheses for how Foxj1 could regulate number under tension (actin remodeling vs. transcriptional control of amplification genes) belong in the Discussion unless tested. Likewise, the passage on Yap sequestration by apical actin, and the two possibilities presented there, should also move to the Discussion.

      Significance

      This manuscript dissects Piezo1-mediated mechanotransduction in the regulation of centriole number in Xenopus multiciliated cells (MCCs) via Ca²⁺, ERK/YAP, and Foxj1. While Piezo1 and its downstream effectors have been implicated broadly in mechanosensation, cellular tension responses, and transcriptional regulation, their specific role in centriole number control in MCCs has been unknown. By integrating pharmacological manipulation, genetic perturbation, and functional readouts, the authors demonstrate that this pathway directly influences centriole number.

      The findings extend published knowledge in two main ways:

      (1) they connect a mechanosensitive ion channel to the transcriptional program governing Foxj1 expression and multiciliation, a mechanistic link not previously defined, and

      (2) they highlight the pleiotropic yet coordinated nature of Piezo1 signaling in organelle biogenesis. This work will be of broad interest to cell and developmental biologists studying ciliogenesis, epithelial differentiation, and mechanotransduction, as well as to biomedical researchers interested in multiciliated cells and ciliopathies. By situating a well-studied mechanosensor within the context of MCC biology, the study opens new directions for understanding how tissue-level forces shape organelle number control and function.

      At the same time, the impact of the study is weakened by concerns regarding the causality and specificity of the pathway, since the signaling components examined are highly pleiotropic and it remains challenging to separate direct effects on centriole number from broader cellular consequences. The causal relationships among Piezo1 activity, downstream signaling, and Foxj1 expression require stronger substantiation, and the extent to which this pathway operates in mammalian multiciliated cells remains an open question. Addressing these limitations would strengthen the robustness, generality, and translational relevance of the conclusions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Narayanan, Kulkarni and colleagues here examine how the Piezo-Erk-Yap pathway is involved in centriole numerical control in multiciliated cells (MCCs). Using reverse genetic and pharmacological methods in Xenopus embryos, they show that Piezo-mediated ERK signalling through to Yap regulates tension-sensitive centriole number, through a mechanism that involves Foxj1, very likely acting as a transcription factor. The data are carefully controlled, robustly analysed and well presented. Statistical analyses are notably thorough.

      Main points:

      1. It should be clarified whether the immunoblots and the related quantitations in Figs. 2 and S2 are all from separate blots/ exposures. If so, they are not useful as controls, and these blots should be repeated with the relevant samples analysed in parallel. Size markers and labels should be included (2B, 2G; S2B and S2G). An increase in total ERK would alter the interpretation of the increase in nuclear pERK in the IF experiments.

      2. Considering the Yap-piezo mechanism of action, the authors' logic for the selection of myb, foxj, plk4 and ccno as transcriptional targets is clear, but the HCR-derived signal and the differences seen in the yap morphants are not very strong, notwithstanding the statistical significance. There appear to be distinct subgroups within the treated populations (in Figure S6B, although these data seem quite different in Fig. 7H, so a comment on the technical differences might be helpful), so that the extent to which Yap1 regulates (Myb-)Foxj1 expression in MCCs is not clearly demonstrated by this experiment. Related to this point, it is unclear why 20-25% of the yap1/ piezo1 MO-treated embryos do not show a decline in Foxj1 in Fig. 6, given the qualitative nature of the scoring. Assuming the KD penetrance would vary on a cell-to-cell basis, rather than an embryo-to-embryo basis, this may suggest that there are additional relevant targets (some of which are discussed by the authors). Single-cell analysis might be a way to address this; however, as this is not a trivial experiment, it might be sufficient to include a caveat in the text. Furthermore, the conclusion that Foxj1 regulates centriole amplification in a tension-dependent manner is well-supported by the data.

      3. Controls for the knockdowns by the various MOs should be provided.

      Minor points:

      1. Autocorrection of ERK1/2 or MEK1/2 pathways to 1/2 should be avoided.

      2. Reference list should be checked for completeness; some citations lack journal/ volume/ page/ year details.

      3. An 'overexposed' version of the image selected for centrioles in Figure 5F might be included with the Chibby-BFP at the same level as in the other figures. At present, the Yap KD cell in the image appears to have normal centrioles; this is potentially confusing, even though the authors clearly explain matters in the text.

      4. It might be clearer to present injected/ uninjected in the same orientation in Fig. 6A and B.

      5. Figure 7B lacks the schematic described in the figure legend.

      Significance

      This study presents novel insight into the developmentally important process of ciliogenesis in multiciliated cells that will be of specific interest to the fields of cilium biology and mechanobiology, with additional general interest in calcium signalling and cell biology.

    1. Reviewer #1: Evidentiary Rating: Potentially Informative

      Written Review: The authors claim that a spin-enhanced fluorescent nanodiamond (FND) lateral flow test for SARS-CoV-2 antigen detection achieves up to 1,100-fold greater sensitivity than conventional gold nanoparticle LFTs using identical antibodies. In a large, blinded clinical evaluation, the assay demonstrated 95.1% sensitivity (Ct ≤ 30) and 100% specificity, enabling SARS-CoV-2 detection on average two days earlier than conventional LFTs and within 0.6 days of RT-qPCR. They assert that this quantum-enhanced diagnostic platform could be adapted to other infectious and non-infectious diseases.

      1. The study represents the first large-scale blinded clinical evaluation of spin-enhanced nanodiamond LFTs, moving beyond proof-of-concept to a performance assessment with real clinical samples.

      2. The antibody screening process using biolayer interferometry is well executed, but screening on a single recombinant antigen source introduces potential epitope bias; inclusion of diverse antigen sources could further validate pair robustness.

      3. The direct head-to-head comparison with in-house AuNP LFTs using identical reagents is a strong methodological choice, eliminating confounding variables common in cross-platform sensitivity claims.

      4. While the assay achieves sub-pg/mL LoDs, residual non-specific binding limits ultimate sensitivity exploitation; more work on membrane chemistry or blocking strategies could push performance closer to the theoretical limit.

      5. The sample size is adequate for preliminary evaluation, but a larger, more demographically and geographically diverse cohort is necessary to confirm real-world performance, especially in asymptomatic and early infection cases.

      6. The assay’s 95.1% sensitivity at Ct ≤ 30 exceeds WHO “desirable” criteria, but the drop in sensitivity at lower viral loads (<10⁴ copies/mL) should be discussed in terms of balancing infectiousness detection with overdiagnosis risk.

      7. The ROC, Bayesian regression, and infection dynamics modelling are sophisticated and well described, but providing raw Ct distribution histograms for positive samples would help readers assess viral load representativeness.

    1. Reviewer #1 (Public review):

      Summary:

      Participants learned a graph-based representation, but, contrary to the hypotheses, failed to show neural replay shortly after. This prompted a critical inquiry into temporally delayed linear modeling (TDLM)--the algorithm used to find replay. First, it was found that TDLM detects replay only at implausible numbers of replay events per second. Second, it detects replay-to-cognition correlations only at implausible densities. Third, there are concerning baseline shifts in sequenceness across participants. Fourth, spurious sequences arise in control conditions without a ground truth signal. Fifth, when reframing simulations previously published, similar evidence is apparent.

      Strengths:

      (1) This work is meticulous and meets a high standard of transparency and open science, with preregistration, code and data sharing, external resources such as a GUI with the task and material for the public.

      (2) The writing is clear, balanced, and matter-of-fact.

      (3) By injecting visually evoked empirical data into the simulation, many surface-level problems are avoided, such as biological plausibility and questions of signal-to-noise ratio.

      (4) The investigation of sequenceness-to-cognition correlations is an especially useful add-on because much of the previous work uses this to make key claims about replay as a mechanism.

      Weaknesses:

      Many of the weaknesses are not so much flaws in the analyses, but shortcomings when it comes to interpretation and a lack of making these findings as useful as they could be.

      (1) I found the bigger picture analysis to be lacking. Let us take stock: in other work, during active cognition, including at least one study from the authors, TDLM shows significant sequenceness. But the evidence provided here suggests that even very strong localizer patterns injected into the data cannot be detected as replay except at implausible speeds. How can both of these things be true? Assuming these analyses are cogent, do these findings not imply something more destructive about all studies that found positive results with TDLM?

      (2) All things considered, TDLM seems like a fairly 'vanilla' and low-assumption algorithm for finding event sequences. It is hard to see intuitively what the breaking factor might be; why do the authors think ground truth patterns cannot be detected by this GLM-based framework at reasonable densities?

      (3) Can the authors sketch any directions for alternative methods? It seems we need an algorithm that outperforms TDLM, but not many clues or speculations are given as to what that might look like. Relatedly, no technical or "internal" critique is provided. What is it about TDLM that causes it to be so weak?

      Addressing these points would make this manuscript more useful, workable, and constructive, even if they would not necessarily increase its scientific breadth or strength of evidence.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Summary: The authors have previously published Mass-spectrometry data that demonstrates a physical interaction between Sall4 and the BAF chromatin complex in iPSC derived neurectodermal cells that are a precursor cell state to neural crest cells. The authors sought to understand the basis of this interaction and investigate the role of Sall4 and the BAF chromatin remodelling complex during neural crest cell specification. The authors first validate this interaction with a co-IP between ARID1B subunit and Sall4 confirming the mass spec data. The authors then utilise in silico modelling to identify the specific interaction between the BAF complex and Sall4, suggesting that this contact is mediated through the BAF complex member DPF2. To functionally validate the role of Sall4 during neural crest specification, the authors utilise CRISPR-Cas9 to introduce a premature stop codon on one allele of Sall4 to generate iPSCs that are haploinsufficient for Sall4. Due to the reports of Sall4's role in pluripotency, the authors confirm that this model doesn't disrupt pluripotent stem cells and is viable to model the role of Sall4 during neural crest induction. The authors expand this assessment of Sall4 function further during their differentiation model to cranial neural crest cells, assessing Sall4 binding with Cut+Run sequencing, revealing that Sall4 binds to motifs that correspond to key genes in neural crest differentiation. Moreover, reduction in Sall4 expression also reduces the binding of the BAF complex, through Cut and Run for BRG1. Overall, the authors then propose a model by which Sall4 and BRG1 bind to and open enhancer regions in neurectodermal cells that enable complete differentiation to cranial neural crest cells.

      Overall, the data is clear and reproducible and offers a unique insight into the role of chromatin remodellers during cell fate specification.

      We thank the Reviewer for the nice words of appreciation of our manuscript.

      However, I have some minor comments.

      1- Using AlphaFold in silico modelling, the authors propose the interaction between the BAF complex with Sall4 is mediated by DPF2, but don't test it. Does a knockout, or knockdown of DPF2 prevent the interaction?

      We agree with the Reviewer that we are not functionally validating our computational prediction that DPF2 is the specific BAF subunit directly linking SALL4 with BAF. We chose not to perform the validation experiment for two main reasons:

      1) This would be outside of the scope of the paper. In fact, from a mechanistic point of view, we have confirmed via both Mass-spectrometry and co-IP with ARID1B that SALL4 and BAF interact in our system. Moreover, mechanistically we also extensively demonstrate that the interaction with SALL4 is required to recruit BAF at the neural crest induction enhancers and we further demonstrate that depletion of SALL4 impairs this. In our view, this was the focus of the manuscript. On the other hand, detecting with certainty which BAF subunit mediates the interaction with SALL4 would be outside the scope of the paper.

      2) Moreover, after careful consideration, we don’t think that even a knock-out of DPF2 would provide a definite answer to which exact BAF subunit mediates the interaction with SALL4. In fact, knock out of DPF2 could potentially disrupt BAF assembly or stability, and this could result in a disruption of the interaction with SALL4 even if DPF2 is not the very subunit mediating it (in other words the experiment could provide a false positive result). In our opinion, the only effective experiment would be mutating the DPF2 residues that we computationally predicted as responsible for the interaction with SALL4, but again this would be very laborious and out of the scope.

      That being said, we agree with the Reviewer that while the SALL4-BAF interaction was experimentally validated with robust approaches, the role of DPF2 in the interaction was only computationally predicted, which is a limitation of the study. We have now added a dedicated paragraph in the discussion to acknowledge this limitation.

      2- OPTIONAL: Does knockout of DPF2 phenocopy the Sall4 ko? This would be very interesting to include in the manuscript, but it would perhaps be a larger body of work.

      See point-1.

      3- Figure 1, the day of IP is not clearly described until later in the text. Please outline this in the figure.

      We thank the Reviewer for pointing this out. This has been fixed.

      3- What is the expression of Sall1 (and other Sall paralogs) during differentiation? The same with the protein levels of Sall4: does this remain below 50%, or is this just during pluripotency?

      As recommended by the Reviewer, we have performed time-course WB of SALL1 and SALL4. These experiments revealed that SALL1 remains very lowly expressed in wild-type conditions across time points and all the way through differentiation until CNCC (See updated supplementary Fig. S9). This is consistent with previous studies that demonstrated that SALL4, but not SALL1, is required for early mammalian development (see for example Miller et al. 2016, Development, and Koulle et al. 2025, Biorxiv). We performed the same time-course WB for SALL4, which revealed that SALL4 expression progressively decreases after day-5 (as expected) and is very low at CNCC stage (day-14); therefore we would expect the KO to remain at an even lower level at this stage.

      4- The authors hypothesise that Sall4 binds to enhancers- with the criteria for an enhancer being that these peaks > 1KB from the TSS are enhancers. Can this be reinforced by overlaying with other ChIP tracks that would give more confidence in this? There are several datasets from Joanna Wysocka's lab that also utilise this protocol which can give you more evidence to reinforce the claim and provide further detail as to the role of Sall4.

      We thank the Reviewer for this great suggestion. As recommended, we have used publicly available ChIP-seq data generated by the Wysocka lab (H3K4me1, H3K4me3) and also generated new H3K27ac ChIP-seq data. These experiments and analyses confirmed that these regions are putative CNCC enhancers (and a minority of them putative promoters), decorated with H3K4me1 and with progressive increase in H3K27ac after CNCC induction (day-5). See new Supplementary Figure S6.

      5- The authors state that cells fail to become cranial neural crest cells; however, they do not propose what the cells do instead. Do they become neural? Or do they stay pluripotent, which is one option given the higher expression of Nanog, OCT4 and OTX2 that are all expressed in pluripotent stem cells?

      We think that it is likely a mix of both. There is a mixed bag of expression of pluripotency markers, but also high expression of neuroectodermal markers. This suggests that most cells safely reach the neuroectodermal stage but fail to go beyond that, while some of the cells simply do not differentiate or regress back to pluripotency. We would rather refrain from overinterpreting what the KO-cells become, as it is likely an aberrant cell type, but following the Reviewer’s indication we have added a paragraph in the discussion to speculate on this.

      6- In general, I would like to see the gating strategy and controls for the flow cytometry in a supplemental figure.

      As recommended by the Reviewer, we have added the gating strategy in the Supplementary Fig. S4.

      7- For supplementary figure 1- please include the gene names in the main image panels rather than just the germ layer.

      Done. The figure is now Supplementary Figure S3 since two supplementary figures were added before.


      Reviewer #2

      Summary In this manuscript, the authors build on their previous work (Pagliaroli et al., 2021) where they identified an interaction between the transcription factor SALL4 and the BAF chromatin remodeling complex at Day-5 of an iPSC to CNCC differentiation protocol. In their current work, the authors begin by exploring this interaction further, leveraging AlphaFold to predict interaction surfaces between SALL4 and BAF complex members, considering both SALL4 splice isoforms: a longer SALL4A (associated with developmental processes) and a shorter SALL4B (associated with pluripotency). They propose that SALL4A may interact with DPF2, a BAF complex member, in an isoform-dependent manner. The authors next explore the role of SALL4 in craniofacial development, motivated by patient heterozygous loss of function mutations, leveraging iPSC cells with an engineered SALL4 frameshift mutation (SALL4-het-KO). Using this model, the authors first demonstrate that a reduced expression of SALL4 does not impact the iPSC identity, perhaps due to compensation via upregulation of SALL1. Upon differentiation to neuroectoderm, SALL4 haploinsufficiency causes a reduction in newly accessible sites which are associated with a reduction in SALL4 binding and therefore a loss of BAF complex recruitment. Interestingly, however, there were few transcriptional changes at this stage. Later in the CNCC differentiation at Day-14 when the wildtype cells have switched expression of CNCC markers, the SALL4-het-KO cells fail to switch cadherin expression associated with a transition from epithelial to mesenchymal state, and fail to induce CNCC specification and post-migratory markers. Together the authors propose that SALL4 recruits BAF to CNCC enhancers as early as the neuroectodermal stage, and failure of BAF recruitment in SALL4-het-KO lines results in a loss of open chromatin at regulatory regions required later for induction of the CNCC programme. 
The failure of the later differentiation is compelling in the light of the early stages of the differentiation progressing normally, and the authors outline an interesting proposed mechanism whereby SALL4 recruits BAF to remodel chromatin ahead of CNCC enhancer activation, a model that can be tested further in future work. The link between SALL4 DNA binding and BAF recruitment is nicely argued, and very interesting as altered chromatin accessibility at Day 5 in the neuroectodermal stage is associated with only few changes in gene expression, while gene expression is greatly impacted later in the CNCC stage at Day 14. The in silico predictions of SALL4-BAF interaction interfaces are perhaps less convincing, requiring experimental follow-up outside the scope of this paper. Some of the associated figures could perhaps be moved to the supplement to enhance the focus on the later functional genomics experiments.

      We thank the Reviewer for the nice words of appreciation of our manuscript.

      Major comments

      1. A lot of emphasis is placed on the AlphaFold predictions in Figure 1, however the predictions in Figure 1B appear to be mostly low or very low confidence scores (coloured yellow and orange). It is unclear how much weight can be placed on these predictions without functional follow-up, e.g. mutating certain residues and showing impact on the interaction by co-IP. The latter parts of the manuscript are much better supported experimentally, and therefore perhaps some of the Figure 1 could move to a Supplemental Figure (e.g. the right-hand part of 1B, and the lower part of Figure 1C showing SALL4B predicted interactions). The limitations of AlphaFold predictions should be acknowledged and the authors should discuss how these predicted interactions could be experimentally explored further in the future.

      As recommended by the Reviewer, we have moved part of the AlphaFold predictions to Supplementary Figure S1, and we added a paragraph in the discussion to acknowledge the limitations of AlphaFold.

      The authors only show data for one heterozygous knockout clone for SALL4. It is usual to have more than one clone to mitigate potential clonal effects. The authors should comment why they only have one clone and include any data for a second clone for key experiments if they already have this. Alternatively, the authors could provide any quality control information generated during production of this line, for example if any additional genotyping was performed.

      We apologize for the confusion and for our lack of clarity on this. We have used two clones (one generated with an 11 bp deletion, one with a 19 bp deletion, both in exon-1, see also the point 6 of your minor points). The two clones were used as biological replicates, so for example the two ATAC-seq replicates performed in each time point were performed with the two different clones, and the three RNA-seq replicates were performed with two technical replicates of the clone with the 11 bp deletion and one replicate with the clone with the 19 bp deletion. We have clarified this in the methods section of the manuscript and added a Supplementary Figure (S2) showing the editing strategy for the two clones. Thank you for catching it.

      The authors show all genomics data (ATAC-seq, CUT&RUN and ChIP-seq) as heatmaps and average profiles. It would be valuable to see some representative loci for the ATAC seq (perhaps along with SALL4 and BRG1 recruitment) at some representative and interesting loci.

      As recommended by the Reviewer, we have added Genome Browser screenshots of representative loci in Fig. 6.

      Figure 4A. The schematic could be improved by including brightfield or immunofluorescent images at the three stages of the differentiation. Are the iPS cells seeded as single cells, or passaged as colonies before starting the differentiation? Further details are required in the methods to clarify how the differentiation is performed, for example on what day the differentiating cells are passaged; this is not shown on the schematic in Figure 4A.

      As recommended, we added IF images in the Fig. 4A schematic, and added more details in the methods.

      There is likely some heterogeneity of cell types in the differentiation at Day 5 and Day 14. Can the authors comment on this from previous publications or perhaps conduct some IF for markers to demonstrate what proportions of cells are neuroectoderm at Day 5 and CNCCs at Day 14?

      The differentiation starts with single cells that aggregate to form neuroectodermal clusters, as per original protocol. The CNCCs that we obtain with this protocol homogeneously express CNCC markers, as shown by IF of SOX9 in Fig. 4A. For the day-5, as recommended we have added IF for PAX6 also showing homogeneous expression (Fig. 4A).

      For the motif analysis for Day 5-specific SALL4 binding sites (Figure 4E), was de novo motif calling performed? Were any binding sites reminiscent of a SALL4 binding site observed (e.g. an AT-rich motif)? Could the authors comment on this in the text - if there is no SALL4 binding motif, does this suggest SALL4 is recruited indirectly to these sites via interaction with another transcription factor for example?

      Similar to SALL4, SALL1 also recognizes AT-rich motifs. However, while we found AT-rich motifs as enriched in our day-5 motif analysis (in the regions that gain SALL4 binding upon differentiation), the enrichment is not particularly strong, and several other motifs are significantly more enriched, suggesting that, like the Reviewer mentioned, SALL4 might be recruited indirectly at these sites by other factors. We have added a paragraph on this in the discussion.

      Does SALL1 remain upregulated at Day-5 and Day-14 of the differentiation for the SALL4-het-KO line? Are binding sites known for this TF and were they detected in the motif analysis performed? Further discussion of the impact of the overexpression of SALL1 on the phenotypes observed is warranted - e.g. for Figure 5F, could the sites associated with a gain of BRG1 peaks upon loss of SALL4 be associated with SALL1 being upregulated and 'hijacking' BAF recruitment to distinct sites associated with nervous system development? Is SALL1 still upregulated at Day 5?

      As mentioned above, SALL1 also recognizes AT-rich motifs but, similar to SALL4, also binds non-specifically, likely in cooperation with other TFs. As the Reviewer suggested, it is certainly possible that some of the sites associated with a gain of BRG1 peaks upon loss of SALL4 could be associated with SALL1 being upregulated and 'hijacking' BAF recruitment to distinct sites. While this is speculative, we have added a paragraph on this in the discussion.

      Related to the point above, SALL4A is proposed to have an isoform-specific interaction with the BAF complex. It would be valuable to plot SALL4A and SALL4B expression from the available RNA-seq data at Day 0, 5 and 14 to explore whether stage-specific isoform expression matches with the proposed role of SALL4A to interact with BAF at Day 5. It could be valuable to also look at expression of SALL1, 2 and 3 across the time course to see whether additional compensation mechanisms are at play during the differentiation.

      Thanks for suggesting this. We performed a time course analysis of isoform specific gene expression, which showed that SALL4B expression remains low throughout differentiation, while SALL4A expression increases upon differentiation cues and remains at high levels until the end. We have added this to supplementary Fig. S9. Moreover, we have performed an additional experiment, using pomalidomide, which is a thalidomide derivative that selectively degrades SALL4A but not SALL4B. Notably, SALL4A degradation recapitulated the main findings obtained with the CRISPR-KO of SALL4, further supporting that SALL4A is the isoform involved in CNCC induction (see new Fig. 8).

      At line 264, the authors state "SALL4 recruits the BAF complex at CNCC developmental enhancers to increase chromatin accessibility". Given that this analysis is performed at Day 5 of the differentiation, which is labelled as neuroectoderm, what evidence do the authors have that these are specifically CNCC enhancers? Statements relating to enhancers should generally be re-phrased to putative enhancers (as no functional evidence is provided for enhancer activity), and further evidence could be provided to support that these are CNCC-specific regulatory elements, e.g. showing representative gene loci from CNCC-specific genes. Discussion of the RNA-seq presented in Supplementary Figure 2B may also be appropriate to introduce here given that large numbers of accessible chromatin sites are detected while the expression of very few genes is impacted, suggesting these sites may become active enhancers at a later developmental stage.

      As also recommended by the other Reviewer, to further characterize these sites, we have used publicly available histone modification ChIP-seq data (H3K4me1, H3K4me3) generated by the Wysocka lab and also generated new H3K27ac ChIP-seq data. These experiments and analyses confirmed that these regions are putative CNCC enhancers (and a minority of them putative promoters), all decorated with H3K4me1, and all showing progressive increase in H3K27ac after CNCC induction (day-5). See new Supplementary Figure S6.

      1. Do any of the putative CNCC enhancers detected at Day 5 as being sensitive to SALL4 downregulation and loss of BAF recruitment overlap with previously tested VISTA enhancers (https://enhancer.lbl.gov/vista/)?

      Yes, we have found examples of overlap and have included two of them in the updated Figure 6 as Genome Browser screenshots.

      Minor comments

      1. The authors are missing references in the introduction "a subpopulation of neural crest cells that migrate dorsolaterally to give rise to the cartilage and bones of the face and anterior skull, as well as cranial neurons and glia".

      Fixed, thank you.

      The discussion of congenital malformations associated with SALL4 haploinsufficiency is brief in the introduction. From OMIM, SALL4 heterozygous mutations are implicated with the condition Duane-radial ray syndrome (DRRS) with "upper limb anomalies, ocular anomalies, and, in some cases, renal anomalies... The ocular anomalies usually include Duane anomaly". That Duane anomaly is one phenotype among a number for patients with SALL4 haploinsufficiency could be clarified in the introduction. Of note, this is stated more clearly in the discussion but needs re-wording in the introduction.

      Done, thank you.

      The statements "show that the SALL4A isoform directly interacts with the BAF complex subunit DPF2 through its zinc-finger-3 domain" and "this interaction occurs between the zinc-finger-cluster-3 (ZFC3) domain of SALL4A and the plant homeodomains (PHDs) of DPF2" in the introduction appear overstated and should be toned down. To show this the authors would need to mutate or delete the proposed important zinc-finger domains from SALL4A, which is outside the scope of this work. Notably, this is less strongly-stated elsewhere in the manuscript, e.g "predict that this interaction is mediated by the BAF subunit DPF2", Line 162.

      Done, thank you.

      Could the authors clarify why 3 Alphafold output models are shown for SALL4B in Figure 1C, and only one output model for SALL4A?

      AlphaFold3 produces five separate predicted models per protein combination (e.g., Model_0 … Model_4), each derived from slightly different network parameters or initializations. The final output prioritizes the model with the highest confidence score. This multi-model strategy enables the identification of the most robust conformation while providing a measure of structural uncertainty (as per GitHub documentation for AlphaFold3). We have conducted the same analysis for SALL4A as we did for SALL4B. Specifically, SALL4A interacts with the AT-rich DNA in models 0, 1, and 2; therefore, models 3 and 4 were excluded. When analysing models 1 and 2, we found a higher number of residues involved in the interaction (>800 instead of 396). Similarly to model 0, only the interactions between residues belonging to an annotated functional domain (ZFs and PHDs) were considered.

      In Model 1: SALL4A and DPF2 interact mainly through ZF6 and ZF7, and not ZF5 as in Model 0.

      In Model 2: SALL4A and DPF2 interact mainly through ZF5 and ZF6, and not ZF7 as in Model 0. In contrast, this model shows an interaction with ZF1 not shown in the other two models, but with a higher PAE (average of 31, compared to averages of 25 to 27 for the other two ZFs).

      Therefore, we considered Model 0, as it is the model with the highest confidence and is representative of all significant models (it includes ZF5, 6, and 7).

      Line 121. The authors state "DPF2, a broadly expressed BAF subunit,", but don't show expression during their CNCC differentiation. It would be good to include expression of DPF2 in Figure 1E.

      Done, thank you.

      The text states "a 11 bp deletion within the 3'-terminus of exon 1 of SALL4", while the figure legend states, "Sanger sequencing confirming the 19 bp deletion in one allele of SALL4 is displayed". The authors should clarify this disparity and experimentally confirm the deletion, e.g. by TA-cloning the two alleles and sequencing these separately to show that one allele is wildtype and the other has a frameshift deletion.

      We apologize for the confusion. As stated above (point 2 of the major comments), we have used two clones (one generated with an 11 bp deletion, one with a 19 bp deletion, both in exon 1; see also point 6 of your minor points). The two clones were used as biological replicates (see response above for details). The deletion in both clones was experimentally confirmed by Sanger sequencing by the company that generated the lines for us (Synthego). The strategy for the two clones is now also shown in Supplementary Fig. S2.

      The authors generate an 11-bp (or 19-bp?) deletion in exon-1 - it would be valuable to include a discussion whether patients have been identified with deletions and frame-shift mutations in this region of SALL4 exon-1. And also clarify, if not clearly stated in the text, that both SALL4A and SALL4B will be impacted by this mutation. Are there examples of patient mutations which only impact SALL4A?

      As requested, we have added a discussion paragraph to discuss this. And, yes, both SALL4A and SALL4B are impacted by both deletions in both clones (11 bp and 19 bp deletion).

      Regarding patient variants in exon 1 and patient variants that only impact SALL4A, we could find only one published pathogenic 170 bp deletion in exon 1 (VCV000642045.7). The majority of the pathogenic or likely pathogenic variants are located in exon 2. In particular, of the 63 reported pathogenic (or likely pathogenic) clinical variants, 42 are located in exon 2. Among these, 28 are located in the portion shared by both SALL4A and SALL4B, while the remaining 14 are SALL4A-specific.

      For the SALL4 blots in Figure 2B, is the antibody expected to detect both isoforms (SALL4A and SALL4B), and which isoform is shown? If two isoforms are detected, they should both be presented in the figure.

      Yes, the antibody detects both isoforms, and we now present both in Figure 2, as recommended.

      SALL4 expression should be shown for Figure 2C to see whether the >50% down-regulation of SALL4 at the protein level may be partially driven by transcriptional changes.

      Done, thank you. As expected, we observed that SALL4 mRNA expression in the KO line is comparable to wild-type conditions, but this still results in a significant decrease in SALL4 protein level, likely because of autoregulatory mechanisms coupled with nonsense-mediated decay of the mutated allele. We also note that SALL4 usually forms homodimers; therefore, an insufficient amount of protein could also lead to degradation of the monomers.

      The number of experimental replicates should be indicated in all figure legends where relevant. Raw data points should be plotted visibly over the violin plots (e.g. Figure 2C).

      Done, thank you.

      For Figure 3A, the images of the DAPI and NANOG/OCT4 staining should be shown separately in addition to the overlay.

      Done, thank you.

      The metric 'Corrected Total Cell Fluorescence (CTCF)' should be described in the methods. The number of images used for the quantification in Figure 3A should be indicated in the legend, and error bars included if multiple images were quantified.

      Done, thank you.

      Figure 3C - what are the 114 differentially expressed genes? Some interesting genes could be labelled on the plot and the data used to generate this plot should be included as a Supplementary Table. Supplementary Tables should similarly be provided for Figure 6C, Day 14 and Supplementary Figure 2B, Day 5.

      As recommended, we have highlighted some interesting genes in the volcano plot and also included all the expression data for all genes in Supplementary Table S3.

      Figure 4B. The shared peaks are not shown. For completeness, it would be ideal to show these sites also.

      Done, thank you.

      Figure 4C is difficult to interpret. Why is the plot asymmetric to the left versus right? What does the axis represent - % of binding sites?

      The asymmetry is due to the fact that there is a larger number of peaks that are downstream of the TSS than peaks that are upstream of TSS. This is consistent with the fact that many SALL4 peaks are in introns, likely representing intronic enhancers.

      Line 224-225. What do n= 3,729 and n= 6,860 refer to? There appear to be many more binding sites indicated in Figure 4B, therefore these numbers cannot represent 86% and 97% of sites?

      Thank you for pointing this out; we should have specified this in the text. Those numbers refer to the genes whose TSS is closest to each SALL4 peak. Notably, multiple peaks can share the same closest TSS, hence the discrepancy between the number of peaks and the number of nearest genes.

      Raw numbers:

      • Day-0 RAW = 6,104 (peaks = 6,114);
      • Day-5 RAW = 17,131 (peaks = 17,137). The raw data are now reported in Supplementary Table 4.
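
      The peak-to-gene counting described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the coordinates and gene names are hypothetical toy values, a single chromosome is assumed, and strand is ignored. It only shows why the number of distinct nearest genes can be smaller than the number of peaks.

```python
from bisect import bisect_left

def nearest_gene(peak_pos, tss_positions, tss_genes):
    """Return the gene whose TSS lies closest to peak_pos.

    tss_positions must be sorted ascending; tss_genes is aligned with it.
    """
    i = bisect_left(tss_positions, peak_pos)
    # Only the TSS just left and just right of the peak can be the closest.
    candidates = []
    if i > 0:
        candidates.append(i - 1)
    if i < len(tss_positions):
        candidates.append(i)
    best = min(candidates, key=lambda j: abs(tss_positions[j] - peak_pos))
    return tss_genes[best]

# Hypothetical toy data (not from the manuscript).
tss_positions = [1_000, 50_000, 120_000]
tss_genes = ["GENE_A", "GENE_B", "GENE_C"]
peaks = [900, 1_500, 48_000, 52_000, 119_000]

assigned = [nearest_gene(p, tss_positions, tss_genes) for p in peaks]
n_peaks = len(assigned)       # every peak receives exactly one gene
n_genes = len(set(assigned))  # several peaks may share the same gene
print(n_peaks, n_genes)
```

      In this toy case five peaks map to three genes, because two pairs of peaks share the same closest TSS.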

      Figure 4E. Several TFs mentioned in the text (Line 243) are not shown in the figure, it would be good to show all TFs motifs mentioned in the text in this figure. Again, there is no mention of whether a sequence-specific motif is detected for SALL4 (e.g. an AT-rich sequence) from this motif analysis.

      Done, thank you. An AT-rich sequence, resembling the SALL4 motif, was detected in a small minority of sites (this is now shown in Supplementary Figure S5), suggesting that SALL4 engages chromatin in a broad manner, going beyond its preferred motif, possibly in cooperation with other TFs. This is consistent with many studies in mESCs showing that SALL4 binds at OCT4/NANOG/SOX2 target motifs. This is now discussed in a dedicated paragraph in the discussion.

      Figure 4G. How was the ATAC-seq data normalized for the WT and SALL4-het-KO lines for this comparison? The background levels of accessibility seem quite different in Replicate 1.

      The bigWig files used to make the heatmaps are normalized by sequencing depth using the deepTools suite (RPKM normalization).
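
      As a hedged illustration of what per-bin RPKM scaling does, the sketch below applies the standard RPKM definition (reads in a bin divided by the bin length in kb and the total mapped reads in millions). The function name and the read counts are hypothetical, and this is not the authors' actual command or data.

```python
def rpkm(reads_in_bin, bin_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads for one genomic bin.

    Equivalent to reads / (bin_length_kb * mapped_reads_millions),
    written with a single division to limit floating-point error.
    """
    return reads_in_bin * 1_000_000_000 / (bin_length_bp * total_mapped_reads)

# Hypothetical example: the second library is sequenced twice as deeply,
# so the same bin collects twice as many raw reads.
wt = rpkm(reads_in_bin=50, bin_length_bp=50, total_mapped_reads=20_000_000)
ko = rpkm(reads_in_bin=100, bin_length_bp=50, total_mapped_reads=40_000_000)
print(wt, ko)
```

      After scaling, the two bins have the same value even though the raw counts differ twofold. Depth normalization of this kind corrects for library size only.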

      Figures 5B-C could be exchanged to flow better with the text. A Venn diagram could be included to show the overlap between the sites losing BRG1 in SALL4-het-KO (13,505 sites) and the Day5-specific SALL4 sites (17,137 sites).

      Done, thank you.

      At Day 5, the authors suggest a shift towards neural differentiation. It could be interesting for the authors to perform qRT-PCR at Day 5 for some neural markers or look in the Day 14 data for markers of neural differentiation at the expense of CNCC markers.

      See updated Supplementary Fig. S8, where we show timecourse expression of several genes, including neural markers.

      Is the data used to plot Figure 5D the same as Figure 4G. If so, why is only one replicate shown in Figure 5D?

      Only one replicate was shown in the main figure purely for lack of space, but the experiment was replicated twice (with the two different clones), and the results were exactly the same. See plots below for your convenience:

      Figure 6A. How many replicates are shown? If n=2, boxplots are not appropriate to represent the distribution of the data. Please include n= X in the figure legend and plot the raw data points also.

      Done, thank you, and as suggested we are no longer using boxplots for this panel.

      Figure 6B. What is the significance of CD99 for CNCC differentiation?

      Figure 6F. No error bars are shown; how many replicates were performed for this time course? The linear regression line does not appear to add much value and could be removed.

      As suggested, we have removed these plots and replaced them with individual genes plots, which include error bars. See updated Supplementary Figure S8.

      At line 304, the authors state "while SALL4-het-KO showed a significant downregulation of these genes". Perhaps 'failed to induce these genes' may be more accurate unless they were expressed at Day 5 and downregulated at Day 14.

      Done, thank you.

      Lines 332-335. The genes selected for pluripotency, neural plate border, CNCC specification could be plotted separately in the Supplement to show individual gene expression dynamics.

      Done, thank you, see point 24.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, the authors build on their previous work (Pagliaroli et al., 2021) where they identified an interaction between the transcription factor SALL4 and the BAF chromatin remodeling complex at Day-5 of an iPSC to CNCC differentiation protocol. In their current work, the authors begin by exploring this interaction further, leveraging AlphaFold to predict interaction surfaces between SALL4 and BAF complex members, considering both SALL4 splice isoforms: a longer SALL4A (associated with developmental processes) and a shorter SALL4B (associated with pluripotency). They propose that SALL4A may interact with DPF2, a BAF complex member, in an isoform-dependent manner. The authors next explore the role of SALL4 in craniofacial development, motivated by patient heterozygous loss of function mutations, leveraging iPSC cells with an engineered SALL4 frameshift mutation (SALL4-het-KO). Using this model, the authors first demonstrate that a reduced expression of SALL4 does not impact the iPSC identity, perhaps due to compensation via upregulation of SALL1. Upon differentiation to neuroectoderm, SALL4 haploinsufficiency causes a reduction in newly accessible sites which are associated with a reduction in SALL4 binding and therefore a loss of BAF complex recruitment. Interestingly, however, there were few transcriptional changes at this stage. Later in the CNCC differentiation at Day-14 when the wildtype cells have switched expression of CNCC markers, the SALL4-het-KO cells fail to switch cadherin expression associated with a transition from epithelial to mesenchymal state, and fail to induce CNCC specification and post-migratory markers. Together the authors propose that SALL4 recruits BAF to CNCC enhancers as early as the neuroectodermal stage, and failure of BAF recruitment in SALL4-het-KO lines results in a loss of open chromatin at regulatory regions required later for induction of the CNCC programme. 
      The failure of the later differentiation is compelling in the light of the early stages of the differentiation progressing normally, and the authors outline an interesting proposed mechanism whereby SALL4 recruits BAF to remodel chromatin ahead of CNCC enhancer activation, a model that can be tested further in future work.

      Major comments

      The link between SALL4 DNA binding and BAF recruitment is nicely argued, and very interesting as altered chromatin accessibility at Day 5 in the neuroectodermal stage is associated with only few changes in gene expression, while gene expression is greatly impacted later in the CNCC stage at Day 14. The in silico predictions of SALL4-BAF interaction interfaces are perhaps less convincing, requiring experimental follow-up outside the scope of this paper. Some of the associated figures could perhaps be moved to the supplement to enhance the focus on the later functional genomics experiments.

      1. A lot of emphasis is placed on the AlphaFold predictions in Figure 1, however the predictions in Figure 1B appear to be mostly low or very low confidence scores (coloured yellow and orange). It is unclear how much weight can be placed on these predictions without functional follow-up, e.g. mutating certain residues and showing impact on the interaction by co-IP. The latter parts of the manuscript are much better supported experimentally, and therefore perhaps some of the Figure 1 could move to a Supplemental Figure (e.g. the right-hand part of 1B, and the lower part of Figure 1C showing SALL4B predicted interactions). The limitations of AlphaFold predictions should be acknowledged and the authors should discuss how these predicted interactions could be experimentally explored further in the future.
      2. The authors only show data for one heterozygous knockout clone for SALL4. It is usual to have more than one clone to mitigate potential clonal effects. The authors should comment why they only have one clone and include any data for a second clone for key experiments if they already have this. Alternatively, the authors could provide any quality control information generated during production of this line, for example if any additional genotyping was performed.
      3. The authors show all genomics data (ATAC-seq, CUT&RUN and ChIP-seq) as heatmaps and average profiles. It would be valuable to see the ATAC-seq signal (perhaps along with SALL4 and BRG1 recruitment) at some representative and interesting loci.
      4. Figure 4A. The schematic could be improved by including brightfield or immunofluorescent images at the three stages of the differentiation. Are the iPS cells seeded as single cells, or passaged as colonies before starting the differentiation? Further details are required in the methods to clarify how the differentiation is performed, for example on what day the differentiating cells are passaged; this is not shown on the schematic in Figure 4A.
      5. There is likely some heterogeneity of cell types in the differentiation at Day 5 and Day 14. Can the authors comment on this from previous publications or perhaps conduct some IF for markers to demonstrate what proportions of cells are neuroectoderm at Day 5 and CNCCs at Day 14.
      6. For the motif analysis for Day 5-specific SALL4 binding sites (Figure 4E), was de novo motif calling performed? Were any binding sites reminiscent of a SALL4 binding site observed (e.g. an AT-rich motif)? Could the authors comment on this in the text - if there is no SALL4 binding motif, does this suggest SALL4 is recruited indirectly to these sites via interaction with another transcription factor for example?
      7. Does SALL1 remain upregulated at Day-5 and Day-14 of the differentiation for the SALL4-het-KO line? Are binding sites known for this TF and were they detected in the motif analysis performed? Further discussion of the impact of the overexpression of SALL1 on the phenotypes observed is warranted - e.g. for Figure 5F, could the sites associated with a gain of BRG1 peaks upon loss of SALL4 be associated with SALL1 being upregulated and 'hijacking' BAF recruitment to distinct sites associated with nervous system development? Is SALL1 still upregulated at Day 5?
      8. Related to the point above, SALL4A is proposed to have an isoform-specific interaction with the BAF complex. It would be valuable to plot SALL4A and SALL4B expression from the available RNA-seq data at Day 0, 5 and 14 to explore whether stage-specific isoform expression matches with the proposed role of SALL4A to interact with BAF at Day 5. It could be valuable to also look at expression of SALL1, 2 and 3 across the time course to see whether additional compensation mechanisms are at play during the differentiation.
      9. At line 264, The authors state "SALL4 recruits the BAF complex at CNCC developmental enhancers to increase chromatin accessibility". Given that this analysis is performed at Day 5 of the differentiation, which is labelled as neuroectoderm what evidence do the authors have that these are specifically CNCC enhancers? Statements relating to enhancers should generally be re-phrased to putative enhancers (as no functional evidence is provided for enhancer activity), and further evidence could be provided to support that these are CNCC-specific regulatory elements, e.g. showing representative gene loci from CNCC-specific genes. Discussion of the RNA-seq presented in Supplementary Figure 2B may also be appropriate to introduce here given that large numbers of accessible chromatin sites are detected while the expression of very few genes is impacted, suggesting these sites may become active enhancers at a later developmental stage.
      10. Do any of the putative CNCC enhancers detected at Day 5 as being sensitive to SALL4 downregulation and loss of BAF recruitment overlap with previously tested VISTA enhancers (https://enhancer.lbl.gov/vista/)?

      Minor comments

      1. The authors are missing references in the introduction "a subpopulation of neural crest cells that migrate dorsolaterally to give rise to the cartilage and bones of the face and anterior skull, as well as cranial neurons and glia".
      2. The discussion of congenital malformations associated with SALL4 haploinsufficiency is brief in the introduction. From OMIM, SALL4 heterozygous mutations are implicated with the condition Duane-radial ray syndrome (DRRS) with "upper limb anomalies, ocular anomalies, and, in some cases, renal anomalies... The ocular anomalies usually include Duane anomaly". That Duane anomaly is one phenotype among a number for patients with SALL4 haploinsufficiency could be clarified in the introduction. Of note, this is stated more clearly in the discussion but needs re-wording in the introduction.
      3. The statements "show that the SALL4A isoform directly interacts with the BAF complex subunit DPF2 through its zinc-finger-3 domain" and "this interaction occurs between the zinc-finger-cluster-3 (ZFC3) domain of SALL4A and the plant homeodomains (PHDs) of DPF2" in the introduction appear overstated and should be toned down. To show this the authors would need to mutate or delete the proposed important zinc-finger domains from SALL4A, which is outside the scope of this work. Notably, this is less strongly-stated elsewhere in the manuscript, e.g "predict that this interaction is mediated by the BAF subunit DPF2", Line 162.
      4. Could the authors clarify why 3 Alphafold output models are shown for SALL4B in Figure 1C, and only one output model for SALL4A?
      5. Line 121. The authors state "DPF2, a broadly expressed BAF subunit,", but don't show expression during their CNCC differentiation. It would be good to include expression of DPF2 in Figure 1E.
      6. The text states "a 11 bp deletion within the 3'-terminus of exon 1 of SALL4", while the figure legend states, "Sanger sequencing confirming the 19 bp deletion in one allele of SALL4 is displayed". The authors should clarify this disparity and experimentally confirm the deletion, e.g. by TA-cloning the two alleles and sequencing these separately to show that one allele is wildtype and the other has a frameshift deletion.
      7. The authors generate an 11-bp (or 19-bp?) deletion in exon-1 - it would be valuable to include a discussion whether patients have been identified with deletions and frame-shift mutations in this region of SALL4 exon-1. And also clarify, if not clearly stated in the text, that both SALL4A and SALL4B will be impacted by this mutation. Are there examples of patient mutations which only impact SALL4A?
      8. For the SALL4 blots in Figure 2B, is the antibody expected to detect both isoforms (SALL4A and SALL4B), and which isoform is shown? If two isoforms are detected, they should both be presented in the figure.
      9. SALL4 expression should be shown for Figure 2C to see whether the >50% down-regulation of SALL4 at the protein level may be partially driven by transcriptional changes.
      10. The number of experimental replicates should be indicated in all figure legends where relevant. Raw data points should be plotted visibly over the violin plots (e.g. Figure 2C).
      11. For Figure 3A, the images of the DAPI and NANOG/OCT4 staining should be shown separately in addition to the overlay.
      12. The metric 'Corrected Total Cell Fluorescence (CTCF)' should be described in the methods. The number of images used for the quantification in Figure 3A should be indicated in the legend, and error bars included if multiple images were quantified.
      13. Figure 3C - what are the 114 differentially expressed genes? Some interesting genes could be labelled on the plot and the data used to generate this plot should be included as a Supplementary Table. Supplementary Tables should similarly be provided for Figure 6C, Day 14 and Supplementary Figure 2B, Day 5.
      14. Figure 4B. The shared peaks are not shown. For completeness, it would be ideal to show these sites also.
      15. Figure 4C is difficult to interpret. Why is the plot asymmetric to the left versus right? What does the axis represent - % of binding sites?
      16. Line 224-225. What do n= 3,729 and n= 6,860 refer to? There appear to be many more binding sites indicated in Figure 4B, therefore these numbers cannot represent 86% and 97% of sites?
      17. Figure 4E. Several TFs mentioned in the text (Line 243) are not shown in the figure, it would be good to show all TFs motifs mentioned in the text in this figure. Again, there is no mention of whether a sequence-specific motif is detected for SALL4 (e.g. an AT-rich sequence) from this motif analysis.
      18. Figure 4G. How was the ATAC-seq data normalized for the WT and SALL4-het-KO lines for this comparison? The background levels of accessibility seem quite different in Replicate 1.
      19. Figures 5B-C could be exchanged to flow better with the text. A Venn diagram could be included to show the overlap between the sites losing BRG1 in SALL4-het-KO (13,505 sites) and the Day5-specific SALL4 sites (17,137 sites).
      20. At Day 5, the authors suggest a shift towards neural differentiation. It could be interesting for the authors to perform qRT-PCR at Day 5 for some neural markers or look in the Day 14 data for markers of neural differentiation at the expense of CNCC markers.
      21. Is the data used to plot Figure 5D the same as Figure 4G. If so, why is only one replicate shown in Figure 5D?
      22. Figure 6A. How many replicates are shown? If n=2, boxplots are not appropriate to represent the distribution of the data. Please include n= X in the figure legend and plot the raw data points also.
      23. Figure 6B. What is the significance of CD99 for CNCC differentiation?
      24. Figure 6F. No error bars are shown; how many replicates were performed for this time course? The linear regression line does not appear to add much value and could be removed.
      25. At line 304, the authors state "while SALL4-het-KO showed a significant downregulation of these genes". Perhaps 'failed to induce these genes' may be more accurate unless they were expressed at Day 5 and downregulated at Day 14.
      26. Lines 332-335. The genes selected for pluripotency, neural plate border, CNCC specification could be plotted separately in the Supplement to show individual gene expression dynamics.

      Significance

      This work provides a conceptual advance in understanding the aetiology of human SALL4-mediated craniofacial malformations in a cell-type specific manner. Leveraging an in vitro differentiation system, the authors define developmental timepoints and cell types impacted by altered SALL4 dosage. Additionally, the authors provide interesting mechanistic insights into how the teratogen thalidomide may impact craniofacial development through proteasomal targeting and degradation of SALL4, and subsequent impact on neural crest differentiation progression.

      Several audiences will be interested in this work: stem cell and developmental biologists (especially those interested in neural crest and facial development), and researchers interested in enhancer regulation, chromatin biology or gene regulatory mechanisms. Clinician scientists and geneticists will be interested in the proposed implications for mechanisms of disease.

      Field of expertise: We have expertise in mechanisms of gene regulation and in vitro models of early development. We are not experts in modeling protein interactions in silico.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: The authors have previously published mass-spectrometry data that demonstrate a physical interaction between Sall4 and the BAF chromatin complex in iPSC-derived neurectodermal cells that are a precursor cell state to neural crest cells. The authors sought to understand the basis of this interaction and investigate the role of Sall4 and the BAF chromatin remodelling complex during neural crest cell specification. The authors first validate this interaction with a co-IP between the ARID1B subunit and Sall4, confirming the mass spec data. The authors then utilise in silico modelling to identify the specific interaction between the BAF complex and Sall4, suggesting that this contact is mediated through the BAF complex member DPF2. To functionally validate the role of Sall4 during neural crest specification, the authors utilise CRISPR-Cas9 to introduce a premature stop codon on one allele of Sall4 to generate iPSCs that are haploinsufficient for Sall4. Due to the reports of Sall4's role in pluripotency, the authors confirm that this model doesn't disrupt pluripotent stem cells and is viable to model the role of Sall4 during neural crest induction. The authors expand this assessment of Sall4 function further during their differentiation model to cranial neural crest cells, assessing Sall4 binding with CUT&RUN sequencing, revealing that Sall4 binds to motifs that correspond to key genes in neural crest differentiation. Moreover, reduction in Sall4 expression also reduces the binding of the BAF complex, as shown by CUT&RUN for BRG1. Overall, the authors then propose a model by which Sall4 and BRG1 bind to and open enhancer regions in neurectodermal cells that enable complete differentiation to cranial neural crest cells.

      Overall, the data is clear and reproducible and offers a unique insight into the role of chromatin remodellers during cell fate specification.

      However, I have some minor comments.

      1. Using AlphaFold in silico modelling, the authors propose the interaction between the BAF complex and Sall4 is mediated by DPF2, but don't test it. Does a knockout, or knockdown, of DPF2 prevent the interaction?
      2. OPTIONAL: Does knockout of DPF2 phenocopy the Sall4 ko? This would be very interesting to include in the manuscript, but it would perhaps be a larger body of work.
      3. Figure 1, the day of IP is not clearly described until later in the text. Please outline this in the figure.

      3. What is the expression of Sall1 (and other Sall paralogs) during differentiation? The same with the protein levels of Sall4: do these remain below 50%, or is this just during pluripotency?
      4. The authors hypothesise that Sall4 binds to enhancers, with the criterion for an enhancer being that peaks > 1 kb from the TSS are enhancers. Can this be reinforced by overlaying with other ChIP tracks that would give more confidence in this? There are several datasets from Joanna Wysocka's lab that also utilise this protocol, which can give you more evidence to reinforce the claim and provide further detail as to the role of Sall4.
      5. The authors state that cells fail to become cranial neural crest cells; however, they do not propose what the cells do instead. Do they become neural? Or do they stay pluripotent, which is one option given the higher expression of Nanog, OCT4 and OTX2 that are all expressed in pluripotent stem cells?
      6. In general, I would like to see the gating strategy and controls for the flow cytometry in a supplemental figure.
      7. For supplementary figure 1, please include the gene names in the main image panels rather than just the germ layer.

      Significance

      The strength of this study lies in its well-designed and clearly presented experiments and datasets. In particular, identifying the specific SALL4 isoform that interacts with the BAF complex, and further exploring the implications of this interaction, is a major highlight. The authors also make effective use of in silico modelling with AlphaFold, offering valuable mechanistic insight into how this interaction is mediated.

      The topic should have appeal to researchers in developmental biology and epigenetics. This study represents a significant step forward in validating the interaction between SALL4 and the BAF complex, and it highlights the requirement of SALL4 for BAF-mediated chromatin remodelling during neural crest specification. These findings are likely to be of interest to those studying the gene regulatory mechanisms underlying craniofacial development.

      However, while the authors outline the roles of SALL4 and the BAF complex in chromatin remodeling during neural crest development, the downstream effects on cell fate specification could be more thoroughly examined. Currently, Gene Ontology analysis is the primary method used to interpret these consequences, and additional functional validation would strengthen the conclusions.

      Intended audience: Basic research, epigenetics in pluripotency and neural crest development.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In this manuscript, Singh, Wu and colleagues explore functional links between septins and the exocyst complex. The exocyst is a conserved octameric complex that mediates the tethering of secretory vesicles for exocytosis in eukaryotes. In fission yeast cells, the exocyst is necessary for cell division, where it localizes mostly at the rim of the division plane, but septins, which localize in a similar manner, are non-essential. The main findings of the work are that septins are required for the specific localization of the exocyst to the rim of the division plane, and the likely consequent localization of the glucanase Eng1 at this same location, where it is known to promote cell separation. In the absence of septins, the exocyst still localizes to the division plane but is not restricted to the rim. They also show some defects in the localization of secretory vesicles and glucan synthase cargo. They further propose that interactions between septins and exocysts are direct, as shown through Alphafold2 predictions (of unclear strength) and clean coIP experiments.

      Strengths: 

      The septin, exocyst and Eng1 localization data are well supported, showing that the septin rim recruits the exocyst and (likely consequently) the Eng1 glucanase at this location. One major finding of the manuscript is that of a physical interaction between septins and exocyst subunits. Indeed, many of the coIPs supporting this discovery are very clear. 

      Weaknesses: 

      I am less convinced by the strength of the physical interaction of septins with the exocyst complex. Notably, one important open question is whether septins interact with the intact exocyst complex, as claimed in the text, or whether the interactions occur only with individual subunits. The two-hybrid and coIP data only show weak interactions with individual subunits, and some coIPs (for instance Sec3 and Exo70 with Spn1 and Spn4) are negative, suggesting that the exocyst complex does not remain intact in these experiments.

      Given the known structure of the full exocyst complex and septin filaments (at least in S. cerevisiae), the Alphafold2 predicted structure could be used to probe whether the proposed interaction sites are compatible with full complex formation.  

      We thank the reviewer for these important and insightful comments. We agree that our current data, particularly the data from yeast two-hybrid and co-immunoprecipitation (coIP) assays, primarily reveal interactions between individual septin and exocyst subunits, and do not conclusively demonstrate binding of septins to the fully assembled exocyst complex. We recognize this as a key limitation and have revised the manuscript text accordingly to clarify this point.

      We also appreciate the reviewer’s suggestion to use structural prediction to further assess the plausibility of these interactions. We have now used the structure of the full Saccharomyces cerevisiae exocyst complex (4.4 Å resolution) published by the Guo group (Mei et al., 2018) to examine the interfaces of the septin–exocyst interactions, assuming that the S. pombe exocyst has a similar structure. We checked all the interacting residues on the exocyst complex and septins from our AlphaFold modeling to determine whether the predicted interactions are structurally compatible. Our analysis reveals that the majority of subunit interactions are sterically feasible, while a few would likely require partial disassembly or flexible conformations. These new insights have been added to the revised Results and Discussion sections (Figure Supplements S4 and S5 and Videos 4-7).

      While we cannot fully resolve whether septins engage with the whole exocyst complex versus selected subunits, our combined data support a model that septins scaffold or spatially regulate the exocyst localization at the division site, potentially through dynamic and multivalent interactions. We now explicitly state this more cautious interpretation in the revised manuscript.

      Mei, K., Li, Y., Wang, S., Shao, G., Wang, J., Ding, Y., Luo, G., Yue, P., Liu, J.-J., Wang, X., Dong, M.-Q., Wang, H.-W. and Guo, W., 2018. Cryo-EM structure of the exocyst complex. Nature Structural & Molecular Biology, 25(2), pp.139-146.

      The effect of spn1∆ on Eng1 localization is very clear, but the effect on secretory vesicles (Ypt3, Syb1) and glucan synthase Bgs1 is less convincing. The effect is small, and it is not clear how the cells are matched for the stage of cytokinesis. 

      For localizations and quantifications of Eng1, Ypt3, Syb1, and Bgs1 shown in Figures 6 and 7, cells with a closed septum (at or after the end of contractile-ring constriction) were quantified or highlighted. To quantify their fluorescence intensity at the division site using a line scan, the line width used was 3 pixels. For Syb1 (Figure 6D), we quantified cells at the end of ring constriction (when Rlc1-tdTomato had constricted to a dot) in the middle focal plane. The exact same lines were drawn in both the Rlc1 and Syb1 channels. The center of the line scan was defined as the pixel with the brightest Rlc1 value. All data were aligned by the center and plotted. For Bgs1 (Figure 7A), we quantified cells in which the Rlc1 signal had disappeared from the division site. The line was drawn in the Bgs1 channel in the middle focal plane. The center of the line scan was defined as the pixel with the brightest Bgs1 value. All data were aligned by the center and plotted. These details were added to the Materials and Methods.
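      The alignment step described in this response can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the window size, and the intensity values are hypothetical, and the 3-pixel line width (an averaging step done when the scan is drawn) is assumed to have been applied already. Each profile is centered on the brightest pixel of a reference channel (Rlc1 for Syb1; for Bgs1 the reference would be the Bgs1 scan itself), and the centered profiles are then averaged position by position.

```python
from statistics import mean

def align_by_peak(reference, target, half_width):
    """Return `target` values in a window centred on the brightest `reference` pixel.

    Positions that fall outside the scan are filled with None.
    """
    center = max(range(len(reference)), key=lambda i: reference[i])
    return [
        target[i] if 0 <= i < len(target) else None
        for i in range(center - half_width, center + half_width + 1)
    ]

def average_aligned(profiles):
    """Average aligned profiles position by position, ignoring None padding."""
    averaged = []
    for column in zip(*profiles):
        values = [v for v in column if v is not None]
        averaged.append(mean(values) if values else None)
    return averaged

# Hypothetical line scans (arbitrary units) from two cells.
rlc1_cell1 = [1, 2, 9, 2, 1]
syb1_cell1 = [3, 4, 5, 4, 3]
rlc1_cell2 = [1, 8, 2, 1, 1]
syb1_cell2 = [5, 7, 5, 3, 2]

aligned = [
    align_by_peak(rlc1_cell1, syb1_cell1, half_width=1),
    align_by_peak(rlc1_cell2, syb1_cell2, half_width=1),
]
mean_profile = average_aligned(aligned)
```

      Centering on the reference-channel peak, rather than on the geometric middle of the drawn line, keeps cells comparable even when the line was not drawn symmetrically across the division site.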

      Reviewer #2 (Public Review): 

      Summary: 

      This interesting study implicates the direct interaction between two multi-subunit complexes, known as the exocyst and septin complexes, in the function of both complexes during cytokinesis in fission yeast. While previous work from several labs had implicated roles for the exocyst and septin complexes in cytokinesis and cell separation, this study describes the importance of protein:protein interaction between these complexes in mediating the functions of these complexes in cytokinesis. Previous studies in neurons had suggested interactions between septins and exocyst complexes occur but the functional importance of such interactions was not known. Moreover, in baker's yeast where both of these complexes have been extensively studied - no evidence of such an interaction has been uncovered despite numerous studies which should have detected it. Therefore while exocyst:septin interactions appear to be conserved in several systems, it appears likely that budding yeast are the exception--having lost this conserved interaction. 

      Strengths: 

      The strengths of this work include the rigorous analysis of the interaction using multiple methods including Co-IP of tagged but endogenously expressed proteins, 2 hybrid interaction, and Alphafold Multimer. Careful quantitative analysis of the effects of loss of function in each complex and the effects on localization and dynamics of each complex was also a strength. Taken together this work convincingly describes that these two complexes do interact and that this interaction plays an important role in post Golgi vesicle targeting during cytokinesis. 

      Weaknesses: 

      The authors used Alphafold Multimer to predict (largely successfully) which subunits were most likely to be involved in direct interactions between the complexes. It would be very interesting to compare this to a parallel analysis on the budding yeast septin and exocyst complexes where it is quite clear that detectable interactions between the exocyst and septins (using the same methods) do not exist. Presumably the resulting pLDDT scores will be significantly lower. These are in silico experiments and should not be difficult to carry out. 

      We thank the reviewer for this insightful suggestion. To assess the specificity of the predicted interactions between septins and the exocyst complex in S. pombe, we performed a comparative AlphaFold2 analysis using some of the homologous subunits from Saccharomyces cerevisiae. We modeled two interactions, Cdc10-Sec5 and Cdc10-Sec15 (Cdc10 is the Spn2 homolog), using the same pipeline and parameters at the time when we did the modeling for S. pombe. We did not find interactions between them using the criteria we applied to the fission yeast proteins in this study. These results support the notion that the predicted septin–exocyst interactions in S. pombe are not generalizable to budding yeast. Unfortunately, we did not test the other combinations at that time, and the AlphaFold2 platform is not available to us now (it returned system error messages when we tried recently). We thank the reviewer again for this helpful suggestion, which should strengthen the evolutionary interpretation of the septin-exocyst interactions once it can be carried out systematically.

      Reviewer #3 (Public Review): 

      Septins in several systems are thought to guide the location of exocytosis, and they have been found to interact with the exocyst vesicle-tethering complex in some cells. However, it is not known whether such interactions are direct or indirect. Moreover, septin-exocyst physical associations were not detected in several other systems, including yeasts, making it unclear whether such interactions reflect a conserved septin-exocytosis link or whether they may be missed if they depend on septin polymerization or association into higher-order structures. Singh et al. set out to define whether and how septins influence the exocyst during S. pombe cytokinesis. Based on three lines of evidence, the authors conclude that septins directly bind to exocyst subunits to regulate localization of the exocyst and vesicle secretion during cytokinesis. The conclusions are consistent with the data presented, but some interpretations need to be clarified and extended:

      (1) The first line of evidence examines septin and exocyst localization during cytokinesis in wild-type and septin-mutant or exocyst-mutant yeast. Quantitative imaging convincingly shows that the detailed localization of the exocyst at the division site is perturbed in septin mutants, and that this is accompanied by modest accumulation of vesicles and vesicle cargos. Whether that is sufficient to explain the increased thickness of the division septum in septin mutants remains unclear.

      The modest accumulation of vesicles and vesicle cargos at the division site is one of the reasons for the increased thickness of the division septum in septin mutants. It is more likely that the misplaced exocyst can still tether vesicles along the division plane (less likely at the rim) without septins. Due to the lack of the glucanase Eng1 at the rim of the division plane in septin mutants, daughter-cell separation is delayed and then cells continue to thicken the septum. We have added these points to the Discussion.

      (2) The second line of evidence involves a comprehensive Alphafold2 analysis of potential pair-wise interactions between septin and exocyst subunits. This identifies several putative interactions in silico, but it is unclear whether the identified interaction surfaces would be available in the full septin or exocyst complexes.  

      We thank the reviewer for raising this important point. We fully agree that a key limitation of pairwise AlphaFold predictions is that they do not account for the higher-order structural context of multimeric protein complexes, such as septin hetero-oligomers or the assembled exocyst complex. As a result, some of the predicted interfaces could indeed be conformationally restricted in the native state.

      To address this concern, we predicted the S. pombe exocyst and septin structures using AlphaFold3 and mapped the predicted contact residues onto the predicted structures. Most predicted interfaces (86% for the exocyst and 86-96% for septins) appear to be located on accessible surfaces in the assembled complexes (Figure Supplements S4 and S5, Videos 4-7), suggesting that these interactions are sterically plausible. We have added this important caveat to the text of the revised manuscript, highlighting the interface accessibility within the assembled complexes. We appreciate the reviewer’s insight, which helped us strengthen the interpretation and limitations of the AlphaFold-based analysis.
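      The accessibility percentages quoted above amount to a set comparison, which can be sketched as follows. This is only an illustration under stated assumptions: the response does not say how surface exposure was determined, so the exposed-residue set is taken as a given input, and the function name and residue numbers are hypothetical.

```python
def accessible_percentage(interface_residues, exposed_residues):
    """Percentage of predicted interface residues that are also surface-exposed."""
    interface = set(interface_residues)
    if not interface:
        raise ValueError("no interface residues supplied")
    return 100.0 * len(interface & set(exposed_residues)) / len(interface)

# Hypothetical residue numbers for one subunit; not taken from the manuscript.
interface = {12, 15, 18, 40, 77}   # contacts predicted by pairwise modeling
exposed = {12, 15, 18, 40, 120}    # residues assumed exposed in the assembled complex
pct = accessible_percentage(interface, exposed)
```

      Residues that are predicted to form contacts but are buried in the assembled complex (residue 77 in this toy input) are the ones that would require partial disassembly or conformational flexibility.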

      (3) The third line of evidence uses co-immunoprecipitation and yeast two hybrid assays to show that several physical interactions predicted by Alphafold2 can be detected, leading the authors to conclude that they have identified direct interactions. However, both methods leave open the possibility that the interactions are indirect and mediated by other proteins in the fission yeast extract (co-IP) or budding yeast cell (two-hybrid). 

      We thank the reviewer for this important clarification. We agree that co-immunoprecipitation (co-IP) and yeast two-hybrid (Y2H) assays cannot conclusively distinguish between direct and indirect interactions. As the reviewer points out, co-IPs may reflect associations mediated by bridging proteins within the fission yeast extract, and Y2H readouts can be influenced by fusion context or endogenous host proteins. In our manuscript, we have now revised the relevant statements in the Results and Discussion sections to clarify that the observed associations are consistent with direct interactions predicted by AlphaFold2, but cannot alone establish direct binding. We have also tempered our terminology, substituting phrases such as “direct interaction” with “physical association consistent with direct binding” where appropriate.

      (4) Based on prior studies it would be expected that the large majority of both septins and exocyst subunits are present in cells and extracts as stoichiometric complexes. Thus, one would expect any septin-exocyst interaction to yield associations detectable with multiple subunits, yet co-IPs were not detected in some combinations. It is therefore unclear whether the interactions reflect associations between fully-formed functional complexes or perhaps between transient folding intermediates. 

      We thank the reviewer for this thoughtful observation. We agree that both septins and exocyst subunits are generally understood to exist in cells as stable, stoichiometric complexes, and that interactions between fully assembled complexes might be expected to yield co-immunoprecipitation signals involving multiple subunits from each complex. However, it was also found that >50% of septins Spn1 and Spn4 are in the cytoplasm even during cytokinesis when the septin double rings are formed (Table 1 of Wu and Pollard, Science 2005, PMID: 16224022). Thus, it is possible that there are pools of free septin and exocyst subunits in the cytoplasm, which were detected in our Co-IP assays. 

      In our experiments, we observed selective co-IP signals between certain septin and exocyst subunits, while other combinations did not yield detectable interactions. We believe these findings could reflect several other possibilities besides the possible interactions among the free subunits in the cytoplasm:

      (1) Some interactions may only be strong enough between specific subunits at exposed interfaces under the Co-IP conditions, rather than through whole complex–complex interactions;

      (2) The detergent and/or salt conditions used in our co-IPs may disrupt labile complex interfaces or partially dissociate multimeric assemblies.

      To address this concern, we now include in the Discussion a paragraph highlighting the possibility that some of the observed interactions may not reflect binding between fully assembled, functional complexes. Notably, most detected interaction pairs are consistent with the AlphaFold predictions, which suggest that specific subunit interfaces may be responsible for mediating contact. While we cannot fully resolve whether septins engage with the whole exocyst complex versus selected subunits, our combined data support a model that septins scaffold or spatially regulate the exocyst localization at the division site, potentially through dynamic and multivalent interactions. We now explicitly state this more cautious interpretation in the revised manuscript. Future biochemical studies using native complex purifications, cross-linking mass spectrometry, in vitro reconstitution with fully assembled septin and exocyst complexes, or in vivo FRET assays will be essential to clarify whether the interactions we observe occur between intact assemblies or intermediate forms.

      Reviewer #1 (Recommendations for the Authors): 

      A major finding from the manuscript is the description of physical interaction of septin subunits with exocyst subunits. The analysis starts from Alphafold2 predictions, shown in Figures 3 and S3. However, some of the most useful metrics of Alphafold, the PAE plot and the pTM and ipTM values, are not provided. It is thus very difficult to estimate the value of the predicted structures (which are also obscured by all side chains). The power of a predicted structure is that it suggests binding interfaces, which is not explored here. At the very least, it would not be difficult to examine whether the proposed binding interfaces are free in the septin filaments and octameric exocyst complex. 

      Please also see response to reviewer #1 (Public Review).

      We thank the reviewer for these very helpful suggestions. We agree that inclusion of AlphaFold2 model confidence metrics—specifically the Predicted Aligned Error (PAE) plots, as well as pTM and ipTM values—is essential for evaluating the reliability of the predicted septin–exocyst interfaces.

      In the revised manuscript, we have now included the PAE plots (Figure 3 and Supplementary S3) and summarized the pTM scores for each predicted septin–exocyst subunit pair. We also provide a short description of these metrics in the figure legend to help guide interpretation. The older AlphaFold2 version (alphafold2advanced) that we used does not report ipTM scores, so these are not included. However, according to our methodology, we only counted interacting residues with pLDDT scores >50%, so we expect that the resulting ipTM scores would not be very low.
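      The pLDDT filter mentioned in this response can be sketched in a few lines. This is an illustration rather than the authors' pipeline: the response does not state how contacts were defined or whether both partner residues had to pass the cutoff, so the sketch assumes a precomputed list of contact pairs and requires both residues to exceed the threshold. The residue labels and scores are hypothetical.

```python
PLDDT_CUTOFF = 50.0  # threshold quoted in the response

def confident_contacts(contacts, plddt, cutoff=PLDDT_CUTOFF):
    """Keep contact pairs whose two residues both exceed the pLDDT cutoff.

    Assumption: requiring *both* residues to pass is a choice made for this
    sketch. Residues missing from `plddt` are treated as low confidence.
    """
    return [
        (a, b)
        for a, b in contacts
        if plddt.get(a, 0.0) > cutoff and plddt.get(b, 0.0) > cutoff
    ]

# Hypothetical residues labelled (chain, residue number); values are illustrative.
plddt = {("A", 10): 82.1, ("A", 11): 45.0, ("B", 200): 67.3, ("B", 201): 91.0}
contacts = [
    (("A", 10), ("B", 200)),
    (("A", 11), ("B", 201)),  # dropped: residue A11 falls below the cutoff
    (("A", 10), ("B", 201)),
]
kept = confident_contacts(contacts, plddt)
```

      Note that pLDDT is a per-residue confidence measure, whereas ipTM scores the interface as a whole, so a filter of this kind limits low-confidence residues but does not substitute for an interface-level metric.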

      In addition, we have updated Figures 3 and S3 to show simplified ribbon diagrams of the interface regions, with side chains hidden by default and selectively displayed only at predicted interaction hotspots. This improves structural clarity and makes the interface regions easier to interpret. We mentioned in the Discussion that the preliminary studies show that the predicted interacting interfaces of Sec15 and Sec5 with septin subunits are accessible for interaction in the whole exocyst complex. The new Figure Supplement S4 and S5 and Videos 4-7 now show the interface residues of both the exocyst and septins that are involved in the interactions.

      Two further points on the interaction: 

      The 2H interaction data is not very convincing. The insets showing beta-gal assays do not look very different from the negative control (compare for instance in panel 4E the Sec15-BD alone, last column, with the Sec15-BD in combination with Spn4-AD, third column: roughly same color), which suggests it is mostly driven by autoactivation of Sec15-BD. Providing growth information in addition to beta-gal may be helpful.

      We appreciate the reviewer’s close evaluation of the yeast two-hybrid (Y2H) assay data, and we agree that the signal observed for the Spn4–Sec15 combination is indeed weak. Unfortunately, we did not perform growth assays. However, we would like to clarify that this is consistent with the nature of the interactions that we are investigating. The interaction between individual septin and exocyst subunits is weak and/or transient, as supported by the weak interactions in the Co-IP experiments. Given that the exocyst only tethers/docks vesicles on the plasma membrane for tens of seconds before vesicle fusion, the multivalent interactions between septins and the exocyst should be very dynamic and not too strong.

      As evidenced by our Co-IP experiments and the multivalent interactions predicted by Alphafold2, the interaction between Spn4 and Sec15 is detectable but weak, suggesting that this may be a low-affinity or transient interaction. Given that Y2H assays have known limitations in detecting such low-affinity interactions, especially those that depend on conformational context or are not optimal in the yeast nucleus, it is perhaps not surprising that the X-gal color development is subtle. These limitations of the Y2H system have been well documented (e.g., Braun et al., 2009; Vidal & Fields, 2014), particularly for interactions with affinities in the micromolar range or those requiring conformational specificity. Therefore, the weak signal observed is in line with expectations for a low-affinity, transient interaction such as that between Spn4 and Sec15.

      Vidal, M. and Fields, S., 2014. The yeast two-hybrid assay: still finding connections after 25 years. Nature methods, 11(12), pp.1203-1206.

      Braun, P., Tasan, M., Dreze, M., Barrios-Rodiles, M., Lemmens, I., Yu, H., Sahalie, J.M., Murray, R.R., Roncari, L., De Smet, A.S. and Venkatesan, K., 2009. An experimentally derived confidence score for binary protein-protein interactions. Nature methods, 6(1), pp.91-97.

      In the coIP experiments, I am confused by the presence of tubulin signal in some of the IPs. For instance, in Fig 4B, but not 4D, where the same Sec15-GFP is immunoprecipitated. There is also a signal in 4C but not 4A. This needs to be clarified. 

      The presence of tubulin in some immunoprecipitates is not unexpected, particularly in experiments involving cytoskeleton-associated proteins such as septins and exocyst subunits. The occasional presence of tubulin in our co-IP samples is consistent with well-documented reports showing tubulin as a frequent non-specific co-purifying protein, particularly under native lysis conditions used to preserve large complexes (Vega and Hsu, 2003; Gavin et al., 2006; Mellacheruvu et al., 2013; Hein et al., 2015). The CRAPome database and quantitative interactomics studies highlight tubulin as one of the most common background proteins in affinity-based workflows. Importantly, tubulin was used as a loading control but not as a marker for interaction in our study, and its variable presence does not reflect a specific interaction with Sec15-GFP or other bait proteins, and we have clarified this point in the revised figure legend.

      Gavin, A.C., Aloy, P., Grandi, P., Krause, R., Boesche, M., Marzioch, M., Rau, C., Jensen, L.J., Bastuck, S., Dümpelfeld, B. and Edelmann, A., 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature, 440(7084), pp.631-636.

      Mellacheruvu, D., Wright, Z., Couzens, A.L., Lambert, J.P., St-Denis, N.A., Li, T., Miteva, Y.V., Hauri, S., Sardiu, M.E., Low, T.Y. and Halim, V.A., 2013. The CRAPome: a contaminant repository for affinity purification–mass spectrometry data. Nature methods, 10(8), pp.730-736.

      Hein, M.Y., Hubner, N.C., Poser, I., Cox, J., Nagaraj, N., Toyoda, Y., Gak, I.A., Weisswange, I., Mansfeld, J., Buchholz, F. and Hyman, A.A., 2015. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell, 163(3), pp.712-723.

      Vega, I.E., Hsu, S.C. 2003. The septin protein Nedd5 associates with both the exocyst complex and microtubules and disruption of its GTPase activity promotes aberrant neurite sprouting in PC12 cells. Neuroreport, 14, pp.31-37.

      Regarding the localization of Ypt3 and Syb1 in WT and spn1∆ in Figure 6C-D and Bgs1 in Figure 7A, it would help to add a contractile ring marker to be able to match the timing of cytokinesis between WT and mutants and ensure that cells of same stage are compared (and add some quantification for Ypt3). In fact, in Figure 7A, next to the cells being pointed at, there are very similar localizations of Bgs1 in WT and spn1∆ at the rim of the ingressing septum, which makes me wonder how the quantified cells were chosen. 

      For localizations and quantifications of Eng1, Ypt3, Syb1, and Bgs1 shown in Figures 6 and 7, cells with a closed septum (at or after the end of contractile-ring constriction) were quantified or highlighted. To quantify their fluorescence intensity at the division site using a line scan, the line width used was 3 pixels. For Syb1 (Figure 6D), we quantified cells at the end of ring constriction (when Rlc1-tdTomato had constricted to a dot) in the middle focal plane. The exact same lines were drawn in both the Rlc1 and Syb1 channels. The center of the line scan was defined as the pixel with the brightest Rlc1 value. All data were aligned by the center and plotted. For Bgs1 (Figure 7A), we quantified cells in which the Rlc1 signal had disappeared from the division site. The line was drawn in the Bgs1 channel in the middle focal plane. The center of the line scan was defined as the pixel with the brightest Bgs1 value. All data were aligned by the center and plotted. These details were added to the Materials and Methods.

      Finally, the manuscript would benefit from some figure reorganization/compaction. Unless work on the binding interfaces is added, Figure 3 and S3 could be removed and summarized by providing the pTM and ipTM values of the predicted interactions. Figure 5 could be combined with Figure 2, as it is essentially a repeat with additional exocyst subunits. 

      Because the binding interfaces have now been added, we keep the original Figures 3 and S3. The experiments in Figure 5 could not be performed before the interaction tests between septins and the exocyst. Thus, to aid the flow of the story, we keep Figures 2 and 5 separate.

      Minor comments: 

      The last sentence of the first paragraph of the results does not make much sense at this point of the paper. After the first paragraph, there is no evidence that colocalization would be required for proper function.  

      We agree that the sentence in question may have overstated the functional implications of colocalization too early in the Results section, before presenting supporting evidence. Our intention was to introduce the hypothesis that spatial proximity between septins and exocyst subunits may be relevant for their coordination during cytokinesis, which we examine in later figures. We have revised the sentence to more accurately reflect the observational nature of the data at this stage in the manuscript as below:

      "These observations suggest the spatial proximity between septins and the exocyst during certain stage of cytokinesis, raising the possibility of their functional coordination, which we would further investigate below."

      What is the indicated n in Figure 6B? Number of cells? 

      Yes, the n in Figure 6B refers to the number of electron microscopy thin sections quantified in the analysis. We have now updated the figure legend to explicitly state this for clarity.

      The causal inference made between the alteration of Exocyst localization in septin mutants and the thicker septum is possible, but by no means certain. It should be phrased more cautiously. 

      We agree that our original phrasing may have overstated the causal relationship between altered exocyst localization in septin mutants and septum thickening. Our data support a correlation between these phenotypes, but additional experiments would be required to establish direct causality.

      To reflect this, we have revised the relevant sentence in the Discussion to read:

      “The modest accumulation of vesicles and vesicle cargos at the division site is one of the reasons for the increased thickness of the division septum in septin mutants. It is more likely that the misplaced exocyst can still tether vesicles along the division plane without septins. Due to the lack of the glucanase Eng1 at the rim of the division plane in septin mutants, daughter-cell separation is delayed and then cells continue to thicken the septum.”

      Reviewer #2 (Recommendations for the Authors): 

      (1) In the display of the AlphaFold Model for the interactions (Figure 3 and Supplemental Figure 3) it is difficult to identify which subunits are where. Residue numbers and subunits should be labeled and only side chains important for the interactions should be present in the model. 

      We appreciate this valuable suggestion. We agree that clearer visual labeling is essential for interpreting the predicted interactions and have revised Figures 3 and S3 accordingly to improve readability and emphasize key structural features.

      Specifically, we have:

      • Labeled each subunit with its name and color-coded consistently across panels.

      • Annotated key interface residues with residue numbers directly in the figure.

      • Removed non-interacting side chains to declutter the model and highlight only those involved in predicted interactions, and expanded the figure legend to explain these changes.

      (2) In Table 1 the column label "Genetic Interaction at 25C" is confusing when synthetic growth defects are shown with a "plus". Rather this column could be labeled "Growth of double mutants at 25C" and then designate the relative growth rate observed at 25C as in Table 2. Designating a negative effect on growth with a plus is confusing. 

      Thanks for the thoughtful suggestions. We have made the suggested changes by deleting the last column so that Tables 1 and 2 are consistent.

      (3) In Figure 4, why is tubulin being co-immunoprecipitated in two of the four anti-GFP IPs? Are the IPs dirty and if so why does it vary between the four experiments? If they are dirty can the non-specific tubulin be removed by additional washes with IP buffer or conversely is it necessary to do minimal washes in order to detect the exocyst-septin interaction by coIP? A comment on this would be helpful. 

      The presence of tubulin in some immunoprecipitates is not unexpected, particularly in experiments involving cytoskeleton-associated proteins such as septins and exocyst subunits. The occasional presence of tubulin in our co-IP samples is consistent with well-documented reports showing tubulin as a frequent non-specific co-purifying protein, particularly under native lysis conditions used to preserve large complexes (Vega and Hsu, 2003; Gavin et al., 2006; Mellacheruvu et al., 2013; Hein et al., 2015). The CRAPome database and quantitative interactomics studies highlight tubulin as one of the most common background proteins in affinity-based workflows. Importantly, tubulin was used as a loading control but not as a marker for interaction in our study, and its variable presence does not reflect a specific interaction with Sec15-GFP or other bait proteins. We have clarified this point in the revised figure legend.

      Gavin, A.C., Aloy, P., Grandi, P., Krause, R., Boesche, M., Marzioch, M., Rau, C., Jensen, L.J., Bastuck, S., Dümpelfeld, B. and Edelmann, A., 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature, 440(7084), pp.631-636.

      Mellacheruvu, D., Wright, Z., Couzens, A.L., Lambert, J.P., St-Denis, N.A., Li, T., Miteva, Y.V., Hauri, S., Sardiu, M.E., Low, T.Y. and Halim, V.A., 2013. The CRAPome: a contaminant repository for affinity purification–mass spectrometry data. Nature methods, 10(8), pp.730-736.

      Hein, M.Y., Hubner, N.C., Poser, I., Cox, J., Nagaraj, N., Toyoda, Y., Gak, I.A., Weisswange, I., Mansfeld, J., Buchholz, F. and Hyman, A.A., 2015. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell, 163(3), pp.712-723.

      Vega, I.E., Hsu, S.C. 2003. The septin protein Nedd5 associates with both the exocyst complex and microtubules and disruption of its GTPase activity promotes aberrant neurite sprouting in PC12 cells. Neuroreport, 14, pp.31-37. 

      In response to the second part of the reviewer’s comment, we washed the pulldown product five times, each time with 1 ml IP buffer at 4 °C. We used this standard protocol for all the co-IP experiments to detect the interactions between different septin and exocyst subunits. We are therefore not sure whether, or how, more washes or more stringent buffer conditions would interfere with detection of the interactions.

      Reviewer #3 (Recommendations for the Authors): 

      In addition to the issues noted in the public review, there were some confusing findings and references to previous literature that merit further consideration or discussion: 

      • The current gold standard for validating Alphafold predictions involves making targeted mutants suggested by the structural predictions. The absence of any such validation weakens the conclusions significantly. 

      We agree that the targeted mutagenesis based on AlphaFold2-predicted interaction interfaces represents a powerful approach to experimentally validate the in silico models. While we did not pursue structure-guided mutagenesis in this study, our goal was to identify putative interactions between septin and exocyst subunits as a foundation for future functional work. Our current conclusions are intentionally limited to proposing putative interfaces, supported by co-immunoprecipitation and genetic interaction data.

      We recognize that direct validation of specific contact residues would significantly strengthen the model. Accordingly, we have revised the Discussion to explicitly state this limitation and to note that structure-based mutagenesis will be an important next step to test the functional relevance of predicted interactions. We have added the following statement:

      “Future studies are needed to refine the residues involved in the interactions because the predicted interacting residues from AlphaFold are too numerous. However, it is encouraging that most of the predicted interacting residues are clustered in several surface patches. Experimental validation through targeted mutagenesis is an important next step.”

      • Much of the writing appears to imply that differences in mutant phenotypes indicate differences in septin (or exocyst) subunit behaviors/functions. However, my reading of the work in budding yeast is that such differences reflect the partial functionality that can be conferred by aberrant partial septin complexes that assemble and may polymerize in mutants lacking different subunits. In this view, which is supported by data showing that essentially all septins are in stoichiometric octameric complexes in cells, the wild-type functions are all mediated by the full complex. Similarly, the separate exocyst subunit localizations based on tagged Sec3 (Finger et al) were not supported by later work from the Brennwald lab with untagged Sec3, and the idea that different exocyst subunits may function separately from the full complex has very limited support in yeast. I would suggest that the text be edited to better reflect the literature, or that different views be better justified. 

      Thanks for the suggestions. We have revised the text accordingly.

      • The comprehensive set of Alphafold2 predictions is a major strength of the paper, but it is unclear to this reader whether the multiple predicted interactions truly reflect multivalent multimode interactions or whether many (most?) predictions would not be consistent with interactions between full complexes and may not indicate physiological interactions. Better discussion of these issues is needed to interpret the findings. 

      We appreciate the reviewer’s suggestion to use structural prediction to further assess interaction plausibility. We have now used the full Saccharomyces cerevisiae exocyst complex structure (4.4 Å resolution) published by the Guo group to examine the interfaces of the septin-exocyst interactions, assuming that the S. pombe exocyst has a similar structure. We mapped the predicted contact residues onto the predicted structure. Most predicted interfaces (86% for the exocyst and 86-96% for septins) appear to be located on accessible surfaces in the assembled complexes (Figure supplements S4 and S5; Videos 4-7), suggesting that these interactions are sterically plausible. We have added this important caveat to the text of the revised manuscript, highlighting the interface accessibility within the assembled complexes. We appreciate the reviewer’s insight, which helped us strengthen the interpretation and limitations of the AlphaFold-based analysis.

      • Some but not all co-IP blots appear to show tubulin (negative control) coming down with the GFP pull-downs. Why is that, and what does it imply for the reliability of the co-IP protocol? 

      The presence of tubulin in some immunoprecipitates is not unexpected, particularly in experiments involving cytoskeleton-associated proteins such as septins and exocyst subunits. The occasional presence of tubulin in our co-IP samples is consistent with well-documented reports showing tubulin as a frequent non-specific co-purifying protein, particularly under native lysis conditions used to preserve large complexes (Vega and Hsu, 2003; Gavin et al., 2006; Mellacheruvu et al., 2013; Hein et al., 2015). The CRAPome database and quantitative interactomics studies highlight tubulin as one of the most common background proteins in affinity-based workflows. Importantly, tubulin was used as a loading control, not as a marker for interaction, in our study, and its variable presence does not reflect a specific interaction with Sec15-GFP or other bait proteins. We have clarified this point in the revised figure legend.

      Gavin, A.C., Aloy, P., Grandi, P., Krause, R., Boesche, M., Marzioch, M., Rau, C., Jensen, L.J., Bastuck, S., Dümpelfeld, B. and Edelmann, A., 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature, 440(7084), pp.631-636.

      Mellacheruvu, D., Wright, Z., Couzens, A.L., Lambert, J.P., St-Denis, N.A., Li, T., Miteva, Y.V., Hauri, S., Sardiu, M.E., Low, T.Y. and Halim, V.A., 2013. The CRAPome: a contaminant repository for affinity purification–mass spectrometry data. Nature methods, 10(8), pp.730-736.

      Hein, M.Y., Hubner, N.C., Poser, I., Cox, J., Nagaraj, N., Toyoda, Y., Gak, I.A., Weisswange, I., Mansfeld, J., Buchholz, F. and Hyman, A.A., 2015. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell, 163(3), pp.712-723.

      Vega, I.E., Hsu, S.C. 2003. The septin protein Nedd5 associates with both the exocyst complex and microtubules and disruption of its GTPase activity promotes aberrant neurite sprouting in PC12 cells. Neuroreport, 14, pp.31-37.

      • Why were two different protocols used for different yeast-two-hybrid analyses? 

      The purpose of using two protocols was to test which protocol is more reliable and sensitive.

      • The different genetic interactions between septin and exocyst mutants when combined with TRAPP-II mutants merits further discussion: might the difference reflect relocation of exocyst from rim to center in septin mutants versus inactivation of exocyst in exocyst mutants? 

      We appreciate this insightful comment and agree that this distinction is likely meaningful. The reviewer correctly notes that septin mutants may not abolish exocyst function but rather cause its spatial mislocalization: from the rim to the center of the division site, whereas the exocyst mutants likely result in partial or complete loss of vesicle tethering activity at the plasma membrane.

      To address this important nuance, we have expanded the Discussion as follows:

      “The genetic interactions between mutations in the exocyst and septins when combined with TRAPP-II mutants may reflect fundamentally different consequences for compromising the exocyst function (Tables 1 and 2). In septin mutants, the exocyst complex still localizes to the division site but is mispositioned from the rim to the center of the division plane. This mislocalization allows partial retention of exocyst function, leading to very mild synthetic or additive defects when combined with compromised TRAPP-II trafficking and tethering. In contrast, in exocyst subunit mutants, the exocyst becomes partial or non-functional, resulting in a more severe loss of exocyst activity. These differing consequences could explain the qualitative differences in genetic interactions observed with TRAPP-II mutants (Tables 1 and 2). Thus, septins and the exocyst also work in different genetic pathways for certain functions in fission yeast cytokinesis.”

      • The vesicle accumulation in septin mutants was quite modest. Does that imply that most vesicles are still fusing in the septum? Further discussion would be beneficial to understand what the authors think this means. 

      We thank the reviewer for this important point. We agree that the modest vesicle accumulation observed in septin mutants suggests that a significant proportion of vesicles continue to successfully fuse at the division site, even in the absence of fully functional septin structures.

      We now discuss this in greater detail in the revised manuscript:

      “The relatively modest vesicle accumulation in septin mutants suggests that septins are not absolutely required for vesicle tethering or fusion per se at the division site. Instead, septins primarily function to spatially organize the targeting sites of exocyst-directed vesicles by stabilizing the localization of the exocyst at the rim of the cleavage furrow. In septin mutants, mislocalization of the exocyst reduces the spatial precision of membrane insertion but still permits vesicle tethering and fusion, albeit in a less controlled manner. Thus, septins likely play a modulatory rather than essential role in exocytic vesicle delivery during cytokinesis. This interpretation aligns with our localization and genetic interaction data, which indicates that septins act as scaffolds to optimize secretion geometry, rather than as core components of the fusion machinery.”

      • It was unclear to this reader why relocation of some exocyst complexes from the rim to the center of the septal region would lead to dramatic thickening of the septum. Further discussion would be beneficial to understand what the authors think this means. 

      The modest accumulation of vesicles and vesicle cargos at the division site is one of the reasons for the increased thickness of the division septum in septin mutants. It is more likely that the misplaced exocyst can still tether vesicles along the division plane without septins. Because of the lack of glucanase Eng1 at the rim of the division plane in septin mutants, daughter-cell separation is delayed and then cells continue to thicken the septum. We have added these points to the Discussion.

    1. Reviewer #3 (Public review):

      Summary:

      Borghi and colleagues present results from 4 experiments aimed at investigating the effects of dual γtACS and iTBS stimulation of the precuneus on behavioral and neural markers of memory formation. In their first experiment (n = 20), they find that a 3-minute offline (i.e., prior to task completion) stimulation that combines both techniques leads to superior memory recall performance in an associative memory task immediately after learning associations between pictures of faces, names, and occupation, as well as after a 15-minute delay, compared to iTBS alone (+ tACS sham) or no stimulation (sham for both iTBS and tACS). Performance in a second task probing short-term memory was unaffected by the stimulation condition. In a second experiment (n = 10), they show that these effects persist over 24 hours and up to a full week after initial stimulation. A third (n = 14) and fourth (n = 16) experiment were conducted to investigate neural effects of the stimulation protocol. The authors report that, once again, only combined iTBS and γtACS increases gamma oscillatory activity and neural excitability (as measured by concurrent TMS-EEG) specific to the stimulated area at the precuneus compared to a control region, as well as precuneus-hippocampus functional connectivity (measured by resting state MRI), which seemed to be associated with structural white matter integrity of the bilateral middle longitudinal fasciculus (measured by DTI).

      Strengths:

      Combining non-invasive brain stimulation techniques is a novel, potentially very powerful method to maximize the effects of these kinds of interventions that are usually well-tolerated and thus accepted by patients and healthy participants. It is also very impressive that the stimulation-induced improvements in memory performance resulted from a short (3 min) intervention protocol. If the effects reported here turn out to be as clinically meaningful and generalizable across populations as implied, this approach could represent a promising avenue for treatment of impaired memory functions in many conditions.

      Methodologically, this study is expertly done! I don't see any serious issues with the technical setup in any of the experiments. It is also very commendable that the authors conceptually replicated the behavioral effects of experiment 1 in experiment 2 and then conducted two additional experiments to probe the neural mechanisms associated with these effects. This certainly increases the value of the study and the confidence in the results considerably.

      The authors used a within-subject approach in their experiments, which increases statistical power and allows for stronger inferences about the tested effects. They also individualized stimulation locations and intensities, which should further optimize the signal-to-noise ratio.

    2. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary:

      The authors make a bold claim that a combination of repetitive transcranial magnetic stimulation (intermittent theta burst-iTBS) and transcranial alternating current stimulation (gamma tACS) causes slight improvements in memory in a face/name/profession task.

      Strengths:

      The idea of stimulating the human brain non-invasively is very attractive because, if it worked, it could lead to a host of interesting applications. The current study aims to evaluate one such exciting application.

      Weaknesses:

      (1) The title refers to the "precuneus-hippocampus" network. A clear definition of what is meant by this terminology is lacking. More importantly, mechanistic evidence that the precuneus and the hippocampus are involved in the potential effects of stimulation remains unconvincing.

      Thank you for the observation. We believe that the evidence collected supports our statement regarding the stimulation of the precuneus and the involvement of the hippocampus. In particular, given the existing evidence on TMS methodology and non-invasive precuneus stimulation (see Koch et al., Brain, 2022; Koch et al., Alzheimer's research & therapy, 2025), the computation of the biophysical model with the E-field we produced (see Biophysical modeling and E-field calculation section in the supplementary information), together with the individual identification of the precuneus through MRI (see iTBS+γtACS neuromodulation protocol and MRI data acquisition in the main text), we can reasonably assume that the individually identified precuneus (PC) was stimulated.

      As we acknowledged in the Limitations section, we cannot entirely rule out the possibility that our results might also reflect stimulation of more superficial parietal regions adjacent to the precuneus. Nor do we provide direct evidence of microscopic changes in the precuneus following stimulation. However, the results we provide in terms of changes in precuneus oscillatory activity and precuneus-hippocampi connectivity sustain both our thesis of the precuneus stimulation and of hippocampi involvement in the stimulation effects.

      Despite this consideration, we agree on the fact that a clear definition of what is meant by the terminology “precuneus-hippocampus network” is lacking. Moreover, since our data and previous evidence sustain the notion of PC stimulation, while this study does not produce direct evidence of the hippocampi stimulation - but only of the effect of the neuromodulation protocol on its connection with the precuneus, we soften the claim in the title. We remove the mention of the precuneus-hippocampus network so that the modified title will be as follows: “Dual transcranial electromagnetic stimulation of the precuneus boosts human long-term memory.”

      (2) The question of the extent to which the stimulation approach and the stimulation parameters used in these experiments causes specific and functionally relevant neural effects remains open. Invasive recordings that could address this question remain out of the scope of this non-invasive study. The authors conducted scalp EEG experiments in an attempt to address this question using non-invasive methods. However, the results shown in Fig. 3 are unclear. The results are inconsistently reported in units of microvolts squared in some panels (3A, 3B) and in units of microvolts in other panels (3C). Also, there is insufficient consideration of potential contamination by signal components reflecting eye movements, other muscle artifacts, or another volume-conducted signal reflecting aggregate activity inside the brain.

      As you correctly noted, Figure 3 presents results obtained from the TMS–EEG recordings. However, there is no inconsistency regarding the measurement units, as we are referring to two distinct indices: one in the frequency domain (oscillatory power, shown in Figures 3A and 3B and expressed in microvolts squared, μV²) and one in the time domain (the TMS-evoked potential, shown in Figure 3C and expressed in microvolts, μV).
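
      The distinction between the two units can be illustrated with a minimal sketch. It is not taken from the study: the sampling rate, frequency, and amplitude are hypothetical values chosen only to show that a time-domain amplitude carries units of μV while a power measure of the same signal carries units of μV².

```python
import math

# All values below are hypothetical and chosen only for illustration.
FS_HZ = 1000          # sampling rate in Hz
FREQ_HZ = 40          # a frequency inside the gamma band
AMPLITUDE_UV = 2.0    # peak amplitude of the sinusoid in microvolts
N_SAMPLES = 1000      # one second of signal, i.e. 40 full cycles

# Synthetic "EEG" trace in microvolts.
signal_uv = [
    AMPLITUDE_UV * math.sin(2 * math.pi * FREQ_HZ * n / FS_HZ)
    for n in range(N_SAMPLES)
]


def peak_amplitude_uv(samples):
    """Time-domain index: largest absolute deflection, in microvolts."""
    return max(abs(s) for s in samples)


def mean_power_uv2(samples):
    """Power index: mean of the squared samples, in microvolts squared."""
    return sum(s * s for s in samples) / len(samples)


peak = peak_amplitude_uv(signal_uv)    # close to the 2 uV peak amplitude
power = mean_power_uv2(signal_uv)      # A^2 / 2 for a sinusoid of amplitude A
```

      For a sinusoid of amplitude A, the mean power over whole cycles is A²/2, so the two indices differ in both value and unit even though they describe the same signal.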

      Regarding the concern about artifacts, this is an important issue on which our group has a strong expertise, having published well-established, highly cited procedures on how to record and clean TMS-EEG signals (e.g., Casula et al., Clinical Neurophysiology, 2017; Rocchi et al., Brain Stimulation, 2021). In the current study, we adopted a well-established and rigorous approach for both data acquisition and preprocessing. This ensured that the recorded TMS–EEG signals were not contaminated by physiological or electrical artifacts.

      As regards the recording procedure, all participants were instructed to fixate on a black cross to minimize eye movements. To avoid auditory-related components caused by the TMS click, we adopted an ad-hoc procedure optimized for TMS-EEG recordings (Rocchi et al., Brain Stimulation, 2021). First, participants were given earphones that continuously played an ad-hoc masking noise composed of white noise mixed with specific time-varying frequencies of the TMS click (Rocchi et al., Brain Stimulation, 2021). The masking noise volume was adjusted to ensure that participants could not detect the TMS click, or as much as tolerated (always below 90 dB). To further reduce the impact of the TMS click on the EEG signal, we placed ear defenders (SNR=30) on top of the earphones. Please see TMS–EEG data acquisition section in the main text.

      As regards the offline cleaning process, we applied Independent Component Analysis (INFOMAX-ICA) to the EEG data to identify and remove components associated with muscle activity, eye movements, blinking, and residual TMS-related artifacts, in line with the most recent guidelines on TMS–EEG preprocessing (Hernandez-Pavon et al., Brain Stimulation, 2023). Specifically, for TMS-related muscle artefacts, we strictly followed the criteria based on their scalp topography, spectral content, timing, and amplitude, which we published in a paper focused on this topic (Casula et al., Clinical Neurophysiology, 2017). We add this detail in the TMS–EEG preprocessing and analysis section in the supplementary information (lines 119-120).

      (3) Figure 3 indicates "Precuneus oscillatory activity ...", but evidence that the activity presented reflects precuneus activity is lacking. The maps shown at the bottom of Figure 3C suggest that the EEG signals recorded with scalp EEG reflect activity generated across a wide spatial range, with a peak encompassing at least tens of centimeters. Thus, evidence that effects specifically reflect precuneus activity, as the paper's title and text throughout the manuscript suggest, is lacking.

      We believe there may have been a misunderstanding. As indicated in the figure caption, panels A and B represent oscillatory activity, whereas panel C displays the TMS-evoked potentials (TEPs). Therefore, the topographical maps mentioned (i.e., those in panel C) did not refer to oscillatory activity, but to differences in TEP amplitude. Specifically, the topographies shown in Figure 3C illustrate statistically significant differences in TEP amplitudes between post-stimulation time points (T1—immediately after stimulation, and T2—20 minutes after stimulation) and the pre-stimulation baseline (T0).

      In this figure, we focused our analysis on a cluster of electrodes overlying the individually identified precuneus, capturing EEG responses to single TMS pulses delivered to that target. This approach, widely used in previous literature (e.g., Koch et al., NeuroImage, 2018; Casula et al., Annals of Neurology, 2022; Koch et al., Brain, 2022; Maiella et al., Clinical Neurophysiology, 2024; Koch et al., Alzheimer’s Research & Therapy, 2025), supports the interpretation that the observed responses reflect precuneus-related activity. Furthermore, the spatially widespread change you mention was statistically significant only when TMS-EEG was conducted over the precuneus (i.e., when the single TMS pulse was delivered over the precuneus) and not when it was performed over the left parietal cortex. We modified the discussion section in the main text to make this clearer (lines 196-199).

      “Moreover, we observed specific cortical changes in the posteromedial parietal areas, as evidenced by the whole-brain analysis conducted on TMS-EEG data when performed over the precuneus and the absence of effect when TMS-EEG was performed on the lateral posterior parietal cortex used as a control condition.”

      That said, we do not state that the effects observed specifically reflect the precuneus activity; indeed, we think the effect of the stimulation is broader, as discussed in the Discussion section. We rather sustain, in line with the literature (Koch et al., Neuroimage 2018; Koch et al., Brain, 2022; Koch et al., Alzheimer's research & therapy, 2025), the idea that the effects observed are a consequence of the precuneus stimulation by the dual stimulation.

      (4) The paper as currently presented (e.g., Figure 3) also lacks rigorous evidence of relevant oscillatory activity. Prior to filtering EEG signals in a particular frequency band, clear evidence of oscillations in the frequency band of interest should be shown (e.g., demonstration of a clear peak that emerges naturally in the frequency range of interest when spectral analysis is applied to "raw" signals). The authors claim that gamma oscillations change because of the stimulation, but a clear peak in the gamma range prior to stimulation is not apparent in the data as currently presented. Thus, the extent to which spectral measurements during stimulation reflect physiological gamma oscillations remains unclear.

      If we understand correctly, your concern relates to the lack of a clear gamma peak before neuromodulation, which may suggest uncertainty about the observed changes in gamma oscillatory activity. Is that correct?

      First, it is important to underline that the natural frequency typically observed in the precuneus falls within the beta range, not the gamma range (see Rosanova et al., Journal of Neuroscience, 2009; Casula et al., Annals of Neurology, 2022). This explains why a prominent gamma peak is not expected at baseline (T0).

      By contrast, our neuromodulatory protocol was specifically aimed at boosting gamma oscillatory activity, given its well-established role in learning and memory processes (Griffiths & Jensen, Trends in Neurosciences, 2023). Thus, to assess the effect of the neuromodulatory protocol, we compared the oscillatory activity before (T0) and after stimulation (T1 and T2), which showed a clear increase in the gamma band. This effect is visible in the raw oscillatory power plot and is most clearly represented in Figure 3B, where the gamma band emerged as the only frequency range showing significant changes across time points.

      (5) Concerns remain regarding the rigor of statistical analyses in the revised manuscript (see also point 8 below). Figure 3B shows an undefined statistical test with p<0.05. The statistical test that was used is not explained. Also, a description of how corrections for multiple comparisons were made is missing. Figures 3A and 3C are not accompanied by statistics, making the results difficult to interpret. For Figure 4C, a claim was made based on a significant p-value for one statistical test and a non-significant p-value in another test. This is a common statistical mistake (see Figure 1 and accompanying discussion in Makin and Orban de Xivry (2019) Science Forum: Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. eLife 8:e48175).

      All statistical tests are described in the Statistical Analysis section of the main text. Specifically, to assess cortical oscillation changes in Experiment 3, we conducted repeated-measures ANOVAs with stimulation condition (iTBS+γtACS vs. iTBS+sham-tACS) and time (ΔT1 = T1–T0; ΔT2 = T2–T0) as within-subject factors, for each frequency band. To further explore the effects of stimulation at each time point, we performed paired t-tests with Bonferroni correction for multiple comparisons. A one-tailed hypothesis was adopted, based on our a priori prediction of gamma-band increase derived from previous work (Maiella et al., 2022).
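
      The post hoc step described above (paired comparisons at each time point, one-tailed, with Bonferroni correction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the data values are invented, the function names are hypothetical, and the p-value comes from an exact sign-flip permutation test that stands in for the paired t-test so that the sketch needs only the Python standard library.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """t statistic of the paired differences a - b (direction: a > b)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def one_sided_sign_flip_p(a, b):
    """Exact one-sided p-value for mean(a - b) > 0.

    Enumerates every sign assignment of the paired differences. This is a
    permutation stand-in for the paired t-test p-value, not the t-test itself.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = sum(diffs)
    n_total = 0
    n_extreme = 0
    for signs in product((1, -1), repeat=len(diffs)):
        n_total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            n_extreme += 1
    return n_extreme / n_total


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical gamma-power changes (arbitrary units) for 8 participants.
real_dt1 = [1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.0]   # real stimulation, T1 - T0
sham_dt1 = [0.3, 0.5, 0.2, 0.4, 0.6, 0.1, 0.2, 0.5]   # sham tACS, T1 - T0
real_dt2 = [0.6, 0.2, 0.9, 0.1, 0.5, 0.8, 0.3, 0.4]   # real stimulation, T2 - T0
sham_dt2 = [0.4, 0.5, 0.3, 0.2, 0.6, 0.2, 0.4, 0.1]   # sham tACS, T2 - T0

raw_p = [
    one_sided_sign_flip_p(real_dt1, sham_dt1),
    one_sided_sign_flip_p(real_dt2, sham_dt2),
]
adjusted_p = bonferroni(raw_p)   # two comparisons, so each p is doubled
```

      With two time points, the Bonferroni step simply doubles each raw p-value (capped at 1), which is equivalent to testing each comparison against α/2.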

      Please note that Figures 3A and 3C are purely descriptive and are therefore not accompanied by statistical tests. Figure 3A shows the full spectral profile across frequencies and conditions, while statistical significance for these data is reported in Figure 3B. Similarly, the upper part of Figure 3C displays the TMS-evoked potential (TEP) in the precuneus, while the statistical comparison of TEP amplitudes across time points is shown in the lower part of Figure 3C.

      Regarding Figure 4C and the article you cited, are you referring to the error described as “Interpreting comparisons between two effects without directly comparing them”? If we understand correctly, this refers to the mistake of inferring an effect by observing that a significant result occurs in one condition or group, while the corresponding result in another condition or group is not significant, without directly testing the difference between them.

      In the case of Experiment 4, which investigates fMRI effects and is illustrated in Figure 4, we employed a general linear model that explicitly modeled both conditions and time points, allowing for a direct statistical comparison. Therefore, the connectivity effect reported does not fall into the category of the error you mentioned.

      Importantly, Figure 4C does not depict the effect of the neuromodulatory protocol itself. Rather, its purpose is to show that, within the real stimulation condition, there is a correlation between the observed effect and the integrity of the bilateral Middle Longitudinal Fasciculus. No conclusions or assumptions were made based on the absence of a significant correlation in the sham condition. However, since it was an exploratory analysis, we decided to soften our claims relative to the neural mechanism in the discussion section of the main text (lines 241-246).

      (6) In the second question posed in the original review, I highlighted that it was unclear how such stimulation would produce memory enhancement. The authors replied that, in the absence of mechanisms, there are many other studies that suffer from the same problem. This raises the question of placebo effects. The paper does not sufficiently address or discuss the possibility that any potential stimulation effects may reflect placebo effects.

      We agree with the reviewer on the potential role of a placebo effect in our study. For this reason, our experimental study had several stimulation conditions, including a placebo condition, which corresponded to the sham iTBS-sham tACS condition, which did not produce any effect.

      (7) The third major concern in the original review was the lack of evidence for a mechanism that is specific to the precuneus. Evidence for specific involvement of the precuneus remains lacking in the revised manuscript. The authors state: "the non-invasive stimulation protocol was applied to an individually identified precuneus for each participant". However, the meaning of this statement is unclear. Specifically, it is unclear how the authors know that they are specifically targeting the precuneus. Without directly recording from the precuneus and directly demonstrating effects, which is outside of the scope of the study, specific involvement of the precuneus seems speculative. Also, it does not seem as though a figure was included in the paper to show how the stimulation protocol specifically targets the precuneus. In their response to the original reviews, the authors state that posterior medial parietal areas are the only regions that show significant differences following the stimulation, but they did not cite a specific figure, or statistics reported in the text, that show this. In any event, posterior medial parietal areas encompass a wide area of the brain, so this would still not provide evidence for an effect specifically involving the precuneus.

      We respectfully disagree with the claim that targeting the precuneus in our study is speculative. The statement that “without directly recording from the precuneus and directly demonstrating effects, which is outside the scope of the study, specific involvement of the precuneus seems speculative” would, by that logic, implicitly call into question a large body of cognitive neuroscience research employing non-invasive techniques such as EEG and fMRI.

      Our methodological approach—combining MRI-guided stimulation, biophysical modeling, and TMS–EEG—is well established and widely used for targeting and studying the role of specific cortical regions, including the precuneus (e.g., Wang et al., Science, 2014; Koch et al., NeuroImage, 2018; Casula et al., Annals of Neurology, 2022, 2023; Koch et al., Brain, 2022; Maiella et al., Clinical Neurophysiology, 2024; Koch et al., Alzheimer’s Research & Therapy, 2025).

      In line with previously published protocols (Santarnecchi et al., Human Brain Mapping, 2018; Özdemir et al., PNAS, 2020; Mantovani et al., Journal of Psychiatric Research, 2021), we identified individual targets (i.e., the precuneus) for each participant based on structural and resting-state functional MRI data (see MRI Data Acquisition and Preprocessing section in the main text). This target was then accurately localized using MRI-guided stereotaxic neuronavigation, ensuring reproducible and anatomically precise stimulation across subjects.

      Finally, concerning the last comment about the lack of figures/statistics showing how the stimulation protocol targets the precuneus and the specificity of the effect observed, we would like to draw attention to the following:

      Figure 3 in the main text, where we show the results of the TMS-EEG over the posterior medial parietal areas;

      Figure S1 in the supplementary information, where the e-field simulation shows how the stimulation protocol targets the brain;

      the "Precuneus iTBS+γtACS increases gamma oscillatory activity" section in the main text results, where we report the results of the statistical analysis of the TMS-EEG conducted over the precuneus and the left posterior parietal cortex, used as a control condition to test for the specificity of the neuromodulation protocol.

      (8) Regarding chance levels, it is unfortunate that the authors cannot quantify what chance levels are in the immediate and delayed recall conditions. This makes interpretation of the results challenging. In the immediate and delayed conditions, the authors state that the chance level is 33%. It would be useful to mark this in the figures. If I understand correctly, chance is 33% in Fig. 2A. If this is the case and if I am interpreting the figure correctly:

      Gray bars for the sham condition appear to be below chance (~20-25%). Why is this condition associated with an accuracy level that is lower than chance?

      Cyan bars and red bars do not appear to be significantly different from chance (i.e., 33%), with red slightly higher than cyan. What statistic was performed to obtain the level of significance indicated in the figure? The highest average value for the red condition appears to be around 35%. More details are needed to fully explain this figure and to support the claims associated with this figure.

      The immediate and delayed recall conditions you mention correspond to a free recall task. In this case, the notion of a fixed "chance level" is not as straightforward as it would be in recognition or forced-choice paradigms, which is why we did not quantify it at first. We explain this in detail below.

      Unlike multiple-choice tasks, where participants select the answer from a limited set of alternatives and the probability of a correct response by chance can be precisely quantified (e.g., 33% in a 3-alternative forced choice), free recall involves the spontaneous retrieval of items from memory without external cues or predefined options. As such, the response range in free recall is essentially unconstrained, encompassing the entire vocabulary of the participant.

      Because of this open-ended nature, the probability of correctly recalling a studied item purely by chance is exceedingly low and can be approximated as zero. In addition, in our task participants had to correctly recollect both name and occupation, which further multiplies the number of possible answers.

      This assumption is further supported by the fact that random guesses in free recall are unlikely to match any of the studied items, given the vast number of possible alternatives. As a result, performance above zero can be reasonably interpreted as reflecting genuine memory retrieval, rather than random guessing.

      As regards statistics, we performed repeated-measures ANOVAs with stimulation condition as a within-subject factor (i.e., iTBS+γtACS; iTBS+sham-tACS; sham-iTBS+sham-tACS) for each dependent variable (see the statistical analysis section in the main text).

      (9) In the revised version of the paper, the authors did not address concerns associated with the block design (please see question 4d in the original review).

      We are sorry for the misunderstanding. We did not address your concerns related to block design because it does not apply to our study. As reported in the paper you mentioned in the original review, a block design involves data collection performed in response to different stimuli of a given class presented in succession. By this definition, it does not correspond to our experimental design, since both TMS-EEG and fMRI were conducted in the resting state (i.e., without the presentation of stimuli) on different days according to the different randomized stimulation conditions.

      In sum, this study presents an admirable aspirational goal, the notion that a non-invasive stimulation protocol could modulate activity in specific brain regions to enhance memory. However, the evidence presented at the behavioral level and at the mechanistic level (e.g. the putative involvement of specific brain regions) remains unconvincing.

      We hope our response will be carefully considered, fostering a constructive exchange and leading to a reassessment of your evaluation.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Borghi and colleagues provides evidence that the combination of intermittent theta burst TMS stimulation and gamma transcranial alternating current stimulation (γtACS) targeting the precuneus increases long-term associative memory in healthy subjects compared to iTBS alone and sham conditions. Using a rich dataset of TMS-EEG and resting-state functional connectivity (rs-FC) maps and structural MRI data, the authors also provide evidence that dual stimulation increased gamma oscillations and functional connectivity between the precuneus and hippocampus. Enhanced memory performance was linked to increased gamma oscillatory activity and connectivity through white matter tracts.

      Strengths:

      The combination of personalized repetitive TMS (iTBS) and gamma tACS is a novel approach to targeting the precuneus, and thereby, connected memory-related regions to enhance long-term associative memory. The authors leverage an existing neural mechanism engaged in memory binding, theta-gamma coupling, by applying TMS at theta burst patterns and tACS at gamma frequencies to enhance gamma oscillations. The authors conducted a thorough study that suggests that simultaneous iTBS and gamma tACS could be a powerful approach for enhancing long-term associative memory. The paper was well-written, clear, and concise.

      Comments on Revision:

      I thank the authors for their thoughtful responses to my first review and their inclusion of more detailed methodological discussion of their rationale for the stimulation protocol conditions and timing. Regarding the apparent difference in connectivity at baseline between conditions, the explanation that this is due to intrinsic dynamics, state, or noise implies the baseline is reflecting transient changes in dynamics rather than a true or stable baseline. Based on this, it looks like iTBS alone is significantly greater than the baseline before the iTBS and γtACS condition but maybe not that much lower than the post-stimulation period for iTBS and γtACS. A longer baseline period should be used to ensure transient states are not driving baseline levels such that these endogenous fluctuations would average out. This also raises questions about whether the effects of iTBS and γtACS or iTBS alone are dependent on the intrinsic state at the time when stimulation begins. Their additional clarification of memory scoring is helpful but also reveals that the effect of dual iTBS+γtACS specifically on the association between faces and names is only just significant. This modest increase in associative memory should be taken into consideration when interpreting these findings.

      We thank the reviewer for the feedback. We fully agree that considering baseline dynamics is critical when assessing the neurophysiological and connectivity effects of stimulation protocols.

      In Experiments 3 and 4, baseline measurements were specifically included in our design to account for the possibility that intrinsic dynamics, state, or noise could influence the observed effects of neuromodulation. Indeed, if we had compared only post-stimulation connectivity between the real and sham conditions, the effects might have appeared larger. The inclusion of baseline measurements allows us to contextualize and better isolate the neuromodulatory impact by controlling for such endogenous fluctuations. Importantly, the fMRI connectivity measurements, which include the baseline, are derived from 10-minute BOLD signal acquisitions, which help mitigate the influence of transient fluctuations and provide a reasonably stable estimate of intrinsic connectivity.

      Moreover, regarding the possibility that stimulation effects may depend on the intrinsic state at stimulation onset, we hypothesize that gamma-frequency entrainment induced by tACS could reduce the variability of intrinsic dynamics, promoting a more stable neural state that is favorable for the induction of long-term plasticity.

      As regards the memory scoring, we would like to clarify that the significant improvement observed in the dual iTBS+γtACS condition does not pertain solely to the face–name association. Rather, it concerns the more demanding task of recalling the association between face, name, and occupation. While we agree that the observed effect could be considered modest, it is worth noting that it follows from only 3 minutes of stimulation.

      Reviewer #3 (Public review):

      Summary:

      Borghi and colleagues present results from 4 experiments aimed at investigating the effects of dual γtACS and iTBS stimulation of the precuneus on behavioral and neural markers of memory formation. In their first experiment (n = 20), they find that a 3-minute offline (i.e., prior to task completion) stimulation that combines both techniques leads to superior memory recall performance in an associative memory task immediately after learning associations between pictures of faces, names, and occupation, as well as after a 15-minute delay, compared to iTBS alone (+ tACS sham) or no stimulation (sham for both iTBS and tACS). Performance in a second task probing short-term memory was unaffected by the stimulation condition. In a second experiment (n = 10), they show that these effects persist over 24 hours and up to a full week after initial stimulation. A third (n = 14) and fourth (n = 16) experiment were conducted to investigate neural effects of the stimulation protocol. The authors report that, once again, only combined iTBS and γtACS increases gamma oscillatory activity and neural excitability (as measured by concurrent TMS-EEG) specific to the stimulated area at the precuneus compared to a control region, as well as precuneus-hippocampus functional connectivity (measured by resting state MRI), which seemed to be associated with structural white matter integrity of the bilateral middle longitudinal fasciculus (measured by DTI).

      Strengths:

      Combining non-invasive brain stimulation techniques is a novel, potentially very powerful method to maximize the effects of these kinds of interventions that are usually well-tolerated and thus accepted by patients and healthy participants. It is also very impressive that the stimulation-induced improvements in memory performance resulted from a short (3 min) intervention protocol. If the effects reported here turn out to be as clinically meaningful and generalizable across populations as implied, this approach could represent a promising avenue for treatment of impaired memory functions in many conditions.

      Methodologically, this study is expertly done! I don't see any serious issues with the technical setup in any of the experiments. It is also very commendable that the authors conceptually replicated the behavioral effects of experiment 1 in experiment 2 and then conducted two additional experiments to probe the neural mechanisms associated with these effects. This certainly increases the value of the study and the confidence in the results considerably.

      The authors used a within-subject approach in their experiments, which increases statistical power and allows for stronger inferences about the tested effects. They also individualized stimulation locations and intensities, which should further optimize the signal-to-noise ratio.

      Weaknesses:

      I think one of the major weaknesses of this study is the overall low sample size in all of the experiments (between n = 10 and n = 20). This is, as I mentioned when discussing the strengths of the study, partly mitigated by the within-subject design and individualized stimulation parameters. The authors mention that they performed a power analysis but this analysis seemed to be based on electrophysiological readouts similar to those obtained in experiment 3. It is thus unclear whether the other experiments were sufficiently powered to reliably detect the behavioral effects of interest. In the revised manuscript, the authors provide post-hoc sensitivity analyses that help contextualize the strength of the findings.

      While the authors went to great lengths trying to probe the neural changes likely associated with the memory improvement after stimulation, it is impossible from their data to causally relate the findings from experiments 3 and 4 to the behavioral effects in experiments 1 and 2. This is acknowledged by the authors and there are good methodological reasons for why TMS-EEG and fMRI had to be collected in separate experiments, but readers should keep in mind that this limits inferences about how exactly dual iTBS and γtACS of the precuneus modulate learning and memory.

      We thank the reviewer for the feedback.

      Reviewer #1 (Recommendations for the authors):

      I suggest:

      (1) Removing all mechanistic claims about the precuneus and hippocampus.

      We have softened our claims about the precuneus-hippocampus network.

      (2) Repeating and focusing on the behavioral experiments with a much larger number of images and stronger statistical power to try to demonstrate a compelling behavioral correlate of the proposed stimulation protocol.

      We clarified the misunderstanding relative to the chance level of the behavioral experiments raised by the reviewer.

      Reviewer #2 (Recommendations for the authors):

      Use a longer baseline to establish a stable gamma level for comparisons in Figure 3.

      If we understand correctly, you propose to lengthen the baseline used to establish the gamma oscillatory activity shown in Figure 3 (the results of experiment 3). Is that right? The figure displays a baseline from -100 to 0 ms, which we use for purely graphical reasons, since no activity is usually observable before the TMS pulse. However, to establish the level of gamma, we used a longer baseline correction window ranging from -700 ms to -300 ms (i.e., 400 ms). We added this important information in the cortical oscillation section of the supplementary information (lines 134-135).
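
      As an illustration of what a baseline correction over the -700 ms to -300 ms window amounts to, here is a minimal sketch. It assumes a simple subtractive correction (the response does not state whether subtraction, division, or a dB conversion was used), and the function name, sampling grid, and power values are made up for the example.

```python
from statistics import mean


def baseline_correct(times_ms, power, window=(-700.0, -300.0)):
    """Subtract the mean power inside the pre-stimulus window from every sample.

    Assumption: subtractive correction; the actual pipeline may differ.
    """
    lo, hi = window
    baseline = [p for t, p in zip(times_ms, power) if lo <= t <= hi]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    offset = mean(baseline)
    return [p - offset for p in power]


# Toy gamma-power trace sampled every 100 ms from -800 ms to +300 ms
# (hypothetical values, TMS pulse at 0 ms).
times_ms = [float(t) for t in range(-800, 400, 100)]
power = [2.0, 1.0, 1.2, 0.8, 1.0, 1.0, 1.1, 0.9, 1.0, 3.0, 2.5, 1.5]

corrected = baseline_correct(times_ms, power)
```

      With this choice of window, samples close to the pulse (for example the -100 to 0 ms interval shown in the figure) do not contribute to the baseline estimate.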

      Reviewer #3 (Recommendations for the authors):

      I think that the authors did a great job responding to the concerns raised by the reviewers. All of my own comments have been satisfactorily addressed. I will update my public review to be more concise, so that it only includes the overall assessment of the manuscript, including the strengths and weaknesses, but without the requests for clarification. Strengths and weaknesses remain largely the same, as the authors did not conduct additional experiments.

      Thank you.

    1. Reviewer #3 (Public review):

      Summary:

      In the manuscript by Shen, Yeung, and colleagues, the authors generate an improved and expanded Mosaic analysis by gRNA-induced crossing-over (MAGIC) toolkit for use in making mosaic clones in Drosophila. This is a clever method by which mitotic clones can be induced in dividing cells by using CRISPR/Cas9 to generate double-strand breaks at specific locations that induce crossing over at those locations. This is conceptually similar to previous mosaic methods in flies that utilized FRT sites that had been inserted near centromeres along with heat-shock inducible FLPase. The advantage of the MAGIC system is that it can be used along with chromosomes lacking FRT sites already introduced, such as those found in many deficiency collections or in EMS mutant lines. It may also be simpler to implement than FRT-based mosaic systems. There are two flavors of the MAGIC system: nMAGIC and pMAGIC. In nMAGIC, the main constituents are a transgene insertion that contains gRNAs that target DNA near the centromere, along with a fluorescent marker. In pMAGIC, the main constituents are a transgenic insertion that contains gRNAs that target DNA near the centromere, along with ubiquitous expression of GAL80. As such, nMAGIC can be used to generate clones that are not labelled, whereas pMAGIC (along with a GAL4 line and UAS-marker) can be used much like MARCM to positively label a clone of cells. This manuscript introduces MAGIC transgenic reagents that allow all 4 chromosomes to be targeted. They demonstrate its use in a variety of tissues, including with mutants not compatible with current FLP/FRT methods, and also show it works well in tissues that prove challenging for FLP/FRT mosaic analyses (such as motor neurons). They further demonstrate that it can be used to generate mosaic clones in non-melanogaster hybrid tissues. 

      Overall, this work represents a valuable improvement to the MAGIC method that should promote even more widespread adoption of this powerful genetic technique.

      Strengths:

      (1) Improves the design of the gRNA-marker by updating the gRNA backbone and also the markers used. GAL80 now includes a DE region that reduces the perdurance of the protein and thus improves labeling of pMAGIC clones. The data presented to demonstrate these improvements are rigorous and of high quality.

      (2) Introduces a toolkit that now covers all chromosome arms in Drosophila. In addition, the efficiency of 3 different target sites is characterized for each chromosome arm (e.g., 3 different gRNA-Marker combinations), which reveals differences in efficiency. This could be useful to titrate how many clones an experimenter might want (e.g., lower efficiency combinations might prove advantageous).

      (3) The manuscript is well written and easy to follow. The authors achieved their aims of creating and demonstrating MAGIC reagents suitable for mosaic analysis of any Drosophila chromosome arm.

      (4) The MAGIC method is a valuable addition to the Drosophila genetics toolkit, and the new reagents described in this manuscript should allow it to become more widely adopted.

      Weaknesses:

      (1) The MAGIC method might not be well known to most readers, and the manuscript could have benefited from schematics introducing the technique.

      (2) Traditional mosaic analyses using the FLP/FRT system have strongly utilized heat-shock FLPase for inducible temporal control over mitotic clones, as well as a way to titrate how many clones are induced (e.g., shorter heat shocks will induce fewer clones). This has proven highly valuable, especially for developmental studies. A heat-shock Cas9 is available, and it would have been beneficial to determine the efficiency of inducing MAGIC clones using this Cas9 source.

    2. Author response:

      Reviewing Editor Comments:

      The following are some consolidated review remarks after discussions amongst all three reviewers:

      The reviewers feel the evidence level could be raised from 'convincing' to 'compelling' if the following key (and partially shared) suggestions by the reviewers are followed adequately:

      (1) Expand labeling options for nMAGIC, which is currently just a BFP marker. This would increase the utility of the method. A far-red marker would be very helpful. Could the authors just do this for one chromosome arm and make the reagent available for others to generate other chromosome arms?

      This is a great suggestion. We will make an nMAGIC vector containing a far-red fluorescent marker and generate a 40D2 version of this nMAGIC gRNA-marker to demonstrate its utility. This vector will be available for others to make additional nMAGIC gRNA-markers.

      (2) Verify that destabilized GAL80 is potent enough to suppress GAL4. Repeat Figure 1C-E with tub-GAL80-DE-SV40.

      We will use a tub-GAL80-DE-SV40 gRNA-marker to test suppression of pxn-Gal4.

      (3) Concern about the health of the induced mitotic clones. This is an important consideration, but the reviewers were not sure what the necessary experiments would be. To gauge twin-spot clone sizes? Please address.

      We will assess the health of induced mitotic clones in wing imaginal discs. We will do this by generating twin spots with an nMAGIC gRNA-marker in wing discs and comparing the sizes of the two cell populations (BFP<sup>+/+</sup> and BFP<sup>-/-</sup>) in twin spots.

      (4) Include a schematic of the MAGIC method as Figure 1 or add it to Figure 1. Many may not be familiar with the method, so to promote its adoption, the authors should clearly introduce the MAGIC method in this paper (and not rely on readers to go to previous publications). For this paper to become a MAGIC reference paper, it should be self-contained.

      We will add a diagram of the MAGIC method in the revised manuscript.

      (5) Determine the utility of using a hs-Cas9 line for temporal induction of MAGIC clones. This is a traditional method for mitotic clone induction (with hsFLP/FRTs), and its use with the MAGIC system (especially pMAGIC) could also make it more attractive, especially to label small populations of neurons born at known times. To this point, the authors could generate pMAGIC clones using hs-Cas9 for commonly used adult target neurons, such as projection neurons, central complex neurons, or mushroom body neurons. The method to label small numbers of these adult neurons is well worked out with known GAL4 lines, and demonstrating that pMAGIC could have similar results would capture the attention of many not familiar with the pMAGIC method.

      We thank the reviewers for this suggestion. We will test hs-Cas9 in inducing pMAGIC clones in one of the neuronal populations in the adult brain, as suggested by the reviewers.

      In addition, we will address all other minor concerns of the reviewers.

    1. Reviewer #1 (Public review):

      Summary:

      The study by Bobola et al reports single-nucleus expression analysis with some supporting spatial expression data of human embryonic and fetal cardiac outflow tracts compared to adult aortic valves. The transcription factor GATA6 is identified as a top regulator of one of the mesenchymal subpopulations, and potential interacting factors and downstream target genes are identified bioinformatically. Additional bioinformatic tools are used to describe cell lineage relationships and trajectories for developmental and adult cardiac cell types.

      Strengths:

      The studies of human tissue and extensive gene expression data will be valuable to the field.

      Weaknesses:

      (1) The expression data are largely confirmatory of previous studies in humans and mice. Thus, it is not clear what novel biological insights are being reported. While there is some novelty and impact in using human tissue, there are extensive existing publications and data sets in this area.

      (2) Major conclusions regarding spatial localization, differential gene expression, or cell lineage relationships based on bioinformatic data are not validated in the context of intact tissues.

      (3) The conclusions regarding lineage relationships are based on common gene expression in the current study and may not reflect cellular origins or lineage relationships that have previously been reported in genetic mouse models.

      (4) An additional limitation is the exclusive examination of adult aortic valve leaflets that represent only a subset of outflow tract derivatives in the mature heart. The conclusion, as stated in the title regarding adult derivatives of the outflow tract, is not accurate based on the limited adult tissue evaluated, exclusive bioinformatic approach, and lack of experimental lineage analysis of cell origins.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Leshem et al. presents a transcriptomic analysis of the developing human outflow tract (OFT) at embryonic and fetal stages using snRNAseq and spatial transcriptomics. Additionally, the authors analyze transcriptomic data from the adult aortic valve to compare embryonic and adult cell populations, aiming to identify persistent embryonic transcriptional signatures in adult cells. A total of 15 clusters were identified from the embryonic and fetal OFT samples, including three mesenchymal and four endothelial clusters. Using SCENIC analysis on the embryonic snRNAseq data, the authors identified GATA6 as a key regulator of valve precursor cells. Spatial transcriptomic analysis of four fetal OFT sections further revealed the spatial distribution of mesenchymal nuclei, smooth muscle cells, and valvular interstitial cells. Trajectory analysis identified two distinct developmental origins of fetal mesenchymal cells: the neural crest and the second heart field. Finally, the authors used snRNAseq data from the adult aortic valve to propose that embryonic transcriptional signatures persist in a subset of adult cells.

      Strengths:

      (1) The study offers a rich and detailed dataset, combining snRNA-seq and spatial transcriptomics in human embryonic and fetal OFT, which are challenging to obtain.

      (2) The use of SCENIC and trajectory analysis adds mechanistic insight into cell lineage and regulatory programs during valve development.

      (3) This study confirms GATA6 as a key regulator of valve precursor cells.

      (4) Comparison between embryonic/fetal and adult datasets represents a novel attempt to trace persistence of developmental transcriptional programs.

      Weaknesses:

      (1) A major limitation is the lack of experimental validation to support key conclusions, particularly the claim of persistent embryonic transcriptional signatures in adult cells.

      (2) The manuscript would benefit from a clearer discussion of how these results advance beyond previous studies in human heart and valve development.

      (3) The comparison between embryonic and adult data is interesting, but would be more convincing with additional evidence supporting the proposed persistence of embryonic transcriptional signatures in adult cells.

    3. Reviewer #3 (Public review):

      Leshem et al have generated a transcriptional cell atlas of the human outflow tract at two developmental timepoints and its adult valvular derivatives. This carefully performed study provides a useful resource for the study of known genes implicated in outflow tract defects and potentially also for discovering new disease genes. The authors reveal neural crest and mesodermal contributions to different outflow tract components and show that GATA6, known to play a role in arterial valve development, controls a set of genes expressed in endocardium-derived cells during valve development. Interestingly, the results suggest lineage persistence of expression of certain genes through to the adult timepoint, a main new finding of this study.

      The following points should be addressed to reinforce the conclusions and emphasize the novel features of this study.

      (1) It would be helpful to clarify how these new findings confirm or diverge from what is known from analysis of neural crest and mesodermal lineage contributions to different cell populations in the mouse heart. Did the authors identify any human-specific populations of cells, such as the LGR5 population reported by Sahara et al?

      (2) The authors should clarify in the introduction and results that they consider the endocardium to be on the SHF trajectory as indicated in Figure S4C. Please add a reference for this point.

      (3) The GATA6 results are interesting and support this experimental approach. The paper would be reinforced if the authors could provide any functional validation (in addition to their GATA6 genomic occupancy data) that the designated target genes are regulated by GATA6. This might involve looking at mutant mouse embryos or cultured cells. Do the authors consider that GATA6 may regulate the endocardial to mesenchymal transition during the early stages of valve development? Or the valve interstitial cell versus fibroblast fate choice?

      (4) Do the new findings reveal whether human valves have a direct SHF to VIC trajectory (ie, without transiting through endocardium) as has been recently shown in the murine non-coronary valve leaflet? Relevant to this point, Figure 5E appears to show contributions to a single adult aortic valve leaflet - this should be explained, or corrected.

    1. Reviewer #1 (Public review):

      Summary:

      In the study by Wang et al. entitled "Dissecting organoid-bacteria interaction highlights decreased contractile force as a key factor for heart infection", a simple cardiac organoid (CO) model was established by combining a heterologous mixture of patient-specific human induced pluripotent stem cell (hiPSC)-derived cardiomyocytes (CMs), primary HUVECs (Human Umbilical Vein Endothelial Cells), and human mesenchymal stem cells (MSCs, representing stromal cells). This model was applied to investigate the interplay between COs and bacterial infection in vitro, with the aim of revealing pathological mechanisms of bacterial infections of the heart in vivo, which may induce myocarditis and consequently heart failure in affected patients.

      Strengths:

      The paper is systematic, well written, and easy to follow.

      Based on their results, the authors state that: "In this study, by developing quantitative tools for analyzing bacterial-cardiac organoid interactions in a 3D, dynamic, clinically relevant setting, we discovered the significant role of cardiac contractility in preventing bacterial infection."

      In principle, the idea of establishing a simple yet functionally and physiologically relevant in vitro model and relevant analytical tools for enabling the study of complex pathological mechanisms of cardiovascular diseases is intriguing.

      Weaknesses:

      However, despite the combination of numerous analytical tools established and applied in the study, the work has substantial experimental limitations, indicating that the bold conclusions may represent a misinterpretation or overinterpretation of the findings.

      Key limitations and questions:

      (1) It seems that iPSCs from only one patient ("dilated cardiomyopathy (DCM) cells were derived from a 47-year-old Asian male with an LMNA gene mutation") were used in the study. Moreover, it seems that only one iPSC-line/clone from that DCM patient was used and compared to a single control iPSC line from a "healthy donor". Therefore, despite the different assays and experimental controls used in the study, there is a high risk that the observed phenomena reflect iPSC-line-/ clone-dependent effects, rather than revealing general pathophysiologic mechanisms. Thus, key experiments must be repeated with cardiomyocytes/ cardiac organoids derived from additional independent iPSC-lines representing different patients and other non-diseased control lines as well. Moreover, it is established good experimental practice in the iPS cell field to generate and include isogenic iPSC controls, i.e., iPSC lines of the same genetic background but with correction of the gene mutation hypothesised to underlie the respective disease (e.g., a cardiovascular disease).

      (2) In Figure 1 (A) immunohistochemical staining for cardiomyocytes for the cardiac marker Troponin is shown, apparently indicating successful cardiomyogenic differentiation of the applied hiPSC lines. In supplemental Figure S1, a flow cytometry analysis specific to cTnT is shown to reveal the CMs content resulting from the monolayer differentiation of respective iPSC lines. Already, the exemplified plots indicate that the CMs' content/ purity for DCM-CMs was notably lower compared to healthy cardiomyocytes (CM; control). This is an important issue, since the non-CMs ("contaminating bystander cells") may have a substantial effect on the functional (including contractile) properties of the COs.

      Interestingly, based on the method description, it seems that COs were generated from cryopreserved iPSC-CMs and iPSC-DCMs, including intermediate seeding and culture on Matrigel before COs formation. However, it remains unclear whether the CMs FACS analysis, which is apparently: "Representative FACS plots for analysis of the cell types in DCM monolayer culture after 33 days of differentiation" shows a CMs purity relevant to CO formation, or something different.

      The lineage phenotype of non-CMs in respective differentiations should also be clarified. Moreover, it should be noted in the results that the CMs content in COs is lower than the 6:2:2 (CM:ECs:MSC) ratio indicated by the authors, since the CMs purity is not 100%, and is particularly reduced in the iPSC-DCMs.

      Finally, to address the important question of the "real CM content" in COs, systematic methods should be applied to quantify the lineage composition of the COs, e.g., IF staining for the three lineages plus DAPI, followed by CO clearing, confocal microscopy "3D stacks", and automated, ImageJ-based quantitative cell counts to determine the total cell number per CO (see e.g. doi: 10.1038/s41596-024-00976-2) and the respective lineage content.

      These questions are of key importance since the presence of non-CMs and their phenotype has profound consequences on the cardiac organoid model, its contractile/ biophysical properties, and, in general, on models' sensitivity to bacterial infections as well.
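      The purity argument in point (2) is simple arithmetic: if the cells seeded as "CMs" are only partly cTnT-positive, the true CM share of an organoid is the nominal share multiplied by the purity. The sketch below illustrates this under stated assumptions. The 6:2:2 ratio comes from the manuscript as quoted above, whereas the two purity values are hypothetical placeholders, not measurements from the study, and the function name is invented for illustration.

```python
# Illustrative sketch only. The purity values are HYPOTHETICAL and are not
# taken from the manuscript; only the 6:2:2 seeding ratio is quoted from it.

def effective_cm_fraction(nominal_cm_share: float, cm_purity: float) -> float:
    """True CM fraction of an organoid seeded at a nominal CM share.

    nominal_cm_share: share of cells seeded as "CMs" (0..1)
    cm_purity: fraction of those cells that really are CMs (0..1)
    """
    if not (0.0 <= nominal_cm_share <= 1.0 and 0.0 <= cm_purity <= 1.0):
        raise ValueError("shares and purities must lie in [0, 1]")
    return nominal_cm_share * cm_purity


nominal = 6 / (6 + 2 + 2)  # CM share in the 6:2:2 (CM:EC:MSC) ratio

# Hypothetical purities chosen only to show the size of the effect.
for label, purity in [("hypothetical control", 0.90), ("hypothetical DCM", 0.70)]:
    print(f"{label}: true CM fraction = {effective_cm_fraction(nominal, purity):.2f}")
```

      A difference in purity between lines therefore translates directly into a difference in CM content between the two organoid types, which is why the reviewer asks for the purity relevant to CO formation to be reported.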

      (3) Figure 2: (F) Why is this figure (Confocal Observations) showing only healthy cardiac organoids (HCOs) but not DCM-COs?

      The overall quality of these pictures is poor and not informative regarding the structural identity and tissue composition of the COs, which actually is an important topic in the frame of the paper, as the 3D structure and tissue composition - and differences between HCOs and DCM-COs - are of key importance to their contractile properties.

      Moreover, the expected overlay of the cardiac markers alpha-actinin and MHC is not obvious from Figure 2F (see also comments on Figure 7, below).

      In Figure 2E: COs at later stages/days should be shown, in particular at that stage, which was used for the functional assays i.e., bacteria infections and contraction pattern monitoring.

      (4) Figure 7 (A) (B) - In the IF sections, it seems that there is no overlay between the expression of the cardiac marker MHC (seems to be expressed in the centre of COs only) and the cardiac markers alpha-actinin (which seems to be unexpectedly expressed in all cells on the sections) and Troponin (which seems to be focally expressed on the outside, excluding the area of MHC expression).

      (F) Quantification of the mean area of gene expression, e.g., for MHC indicates a larger area after MHC expression; this seems to entirely contradict the IF pictures (in Figures 7 A-D) of MHC expression before and after infection. This contradiction is deemed very critical by this reviewer, as it may indicate that the IF staining, data analysis, and/or data interpretation in this part of the manuscript is poor, misleading, or simply wrong.

      (5) Overall, from the perspective of this reviewer, the CO-derived results do not reflect in a meaningful way the contractile and hydrodynamic conditions in the mouse heart or the human heart. Thus, it seems that the conclusions may rather represent a hypothesised outcome bias.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The authors aimed to investigate the cellular mechanisms underlying place field formation (PFF) in hippocampal CA1 pyramidal cells by performing in vivo two-photon calcium imaging in head-restrained mice navigating a virtual environment. Specifically, they sought to determine whether BTSP-like (behavioral time scale synaptic plasticity) events, characterized by large calcium transients, are the primary mechanism driving PFFs or if other mechanisms also play a significant role. Through their extensive imaging dataset, the authors found that while BTSP-like events are prevalent, a substantial fraction of new place fields are formed via non-BTSP-like mechanisms. They further observed that large calcium transients, often associated with BTSP-like events, are not sufficient to induce new place fields, indicating the presence of additional regulatory factors (possibly local dendritic spikes).

      Strengths

      The study makes use of a robust and extensive dataset collected from 163 imaging sessions across 45 mice, providing a comprehensive examination of CA1 place-cell activity during navigation in both familiar and novel virtual environments. The use of two-photon calcium imaging allows the authors to observe the detailed dynamics of neuronal activity and calcium transients, offering insights into the differences between BTSP-like and non-BTSP-like PFF events. The study's ability to distinguish between these two mechanisms and analyze their prevalence under different conditions is a key strength, as it provides a nuanced understanding of how place fields are formed and maintained. The paper supports the idea that BTSP is not the only driving force behind PFF, and other mechanisms are likely sufficient to drive PFF, and BTSP events may also be insufficient to drive PFF in some cases. The longer-than-usual virtual track used in the experiment allowed place cells to express multiple place fields, adding a valuable dimension to the dataset that is typically lacking in similar studies. Additionally, the authors took a conservative approach in classifying PFF events, ensuring that their findings were not confounded by noise or ambiguous activity.

      Weaknesses

      Despite the impressive dataset, there are several methodological and interpretational concerns that limit the impact of the findings. Firstly, the virtual environment appears to be poorly enriched, relying mainly on wall patterns for visual cues, which raises questions about the generalizability of the results to more enriched environments. Prior studies have shown that environmental enrichment can significantly influence spatial coding, and it would be important to determine how a more immersive VR environment might alter the observed PFF dynamics. Secondly, the study relies on deconvolution methods in some cases to infer spiking activity from calcium signals without in vivo ground truth validation. This introduces potential inaccuracies, as deconvolution is an estimate rather than a direct measure of spiking, and any conclusions drawn from these inferred signals should be interpreted with caution. Thirdly, the figures would benefit from clearer statistical annotations and visual enhancements. For example, several plots lack indicators of statistical significance, making it difficult for readers to assess the robustness of the findings. Furthermore, the use of bar plots without displaying underlying data distributions obscures variability, which could be better visualized with violin plots or individual data points. The manuscript would also benefit from a more explicit breakdown of the proportion of place fields categorized as BTSP-like versus non-BTSP-like, along with clearer references to figures throughout the results section. Lastly, the authors' interpretation of their data, particularly regarding the sufficiency of large calcium transients for PFF induction, needs to be more cautious. Without direct confirmation that these transients correspond to actual BTSP events (including associated complex spikes and calcium plateau potentials), concluding that BTSP is not necessary or sufficient for PFF formation is speculative.

      Reviewer #2 (Public review):

      Summary:

      The authors of this manuscript aim to investigate the formation of place fields (PFs) in hippocampal CA1 pyramidal cells. They focus on the role of behavioral time scale synaptic plasticity (BTSP), a mechanism proposed to be crucial for the formation of new PFs. Using in vivo two-photon calcium imaging in head-restrained mice navigating virtual environments, they employed a classification method based on calcium activity to categorize the formation of place cells' place fields as BTSP-like or non-BTSP-like, and investigated the properties of each category.

      Strengths:

      A new method to use calcium imaging to separate BTSP and non-BTSP place field formation. This work offers new methods and factual evidence for other researchers in the field.

      The method enabled the authors to reveal that while many PFs are formed by BTSP-like events, a significant number of PFs emerge with calcium dynamics that do not match BTSP characteristics, suggesting a diversity of mechanisms underlying PF formation. The characteristics of place fields under the first two categories are comprehensively described, including aspects such as formation timing, quantity, and width.

      Weaknesses:

      There are some issues about data and statistics that need to be addressed before these research findings can be considered as rigorous conclusions.

      While the authors mentioned 3 features of PF generated by BTSP during calcium imaging in the Introduction, the classification method used features 1 and 2. The confirmation by feature 3 in its current form is important but not strong enough.

      Some key data are missing, such as the excluded PFs, the BTSP/non-BTSP ratio of each animal, etc.

      Impact:

      This work is likely to provide the field with a new method to classify BTSP and non-BTSP place field formation using calcium imaging.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, Sumegi et al. use calcium imaging in head-fixed mice to test whether new place fields tend to emerge due to events that resemble behavioral time scale plasticity (BTSP) or other mechanisms. An impressive dataset was amassed (163 sessions from 45 mice with 500-1000 neurons per sample) to study the spontaneous emergence of new place fields in area CA1 that had the signature of BTSP. The authors observed that place fields could emerge due to BTSP and non-BTSP-like mechanisms. Interestingly, when non-BTSP mechanisms seemed to generate a place field, this tended to occur on a trial with a spontaneous reset in neural coding (a remapping event). Novelty seemed to upregulate non-BTSP events relative to BTSP events. Finally, large calcium transients (presumed plateau potentials) were not sufficient to generate a place field.

      Strengths:

      I found this manuscript to be exceptionally well-written, well-powered, and timely given the outstanding debate and confusion surrounding whether all place fields must arise from BTSP events. Working at the same institute, Albert Lee (e.g. Epsztein et al., 2011 - which should be cited) and Jeff Magee (e.g. Bittner et al., 2017) showed contradictory results for how place fields arise. These accounts have not fully been put toe-to-toe and reconciled in the literature. This manuscript addresses this gap and shows that both accounts are correct - place fields can emerge due to a pre-existing map and due to BTSP.

      We thank the Reviewer for his/her appreciation of the importance of our study. We have included the additional reference.

      Weaknesses:

      I find only three significant areas for improvement in the present study:

      First, can it be concluded that non-BTSP events occur exclusively due to a global remapping event, as stated in the manuscript "these PFF surges included a high fraction of both non-BTSP- and BTSP-like PFF events, and were associated with global remapping of the CA1 representation"? Global remapping has a precise definition that involves quantifying the stability of all place fields recorded. Without a color scale bar in Figure 3D (which should be added), we cannot know whether the overall representations were independent before and after the spontaneous reset. It would be good to know if some neurons are able to maintain place coding (more often than expected by chance), suggestive of a partial-remapping phenomenon.

      We have performed the analysis suggested by the Reviewer and determined what fraction of CA1PCs retained its original tuning property after the representation switch. We found that the remapping was essentially global, as only a small fraction (5.4%) of CA1PCs retained their pre-switch tuning curve after the switch. This is now described in the Results.

      We now state in the figure legend for the former Figure 3D (now Figure 3F) that the color scale applies to all subpanels.

      We would like to note that we do not conclude that non-BTSP events occur exclusively during global remapping – we have found a sizable fraction of PFF by non-BTSP mechanism also in the familiar environment with no signs of change in the population representation. We agree nonetheless that PFF is dominated by BTSP under these conditions, whereas the contribution of non-BTSP is larger during global remapping events.

      Second, BTSP has a flip side that involves the weakening of existing place fields when a novel field emerges. Was this observed in the present study? Presumably place fields can disappear due to this bidirectional BTSP or due to global remapping. For a full comparison of the two phenomena, the disappearance of place fields must also be assessed.

      In this study we focused on the birth of new PFs, yet PFs not only form but also disappear constantly. The factors driving PF weakening are even less explored and understood than those driving PF birth. In fact, we observed (as illustrated by several examples in our MS) that many PFs weaken or disappear completely during the course of an imaging session. These effects are sometimes accompanied by a new PFF event elsewhere (e.g. Figure 2 – figure supplement 2E bottom), whereas in other cases they are not (e.g. Figure 5A, middle). Similarly, some BTSP events seem to coincide with the disappearance of another PF, but others do not (e.g. Figure 2A bottom, first PF along the track; Figure 3 – figure supplement 1A left, first PF). The picture is further complicated in the case of global remapping events (i.e. representation switches, Figure 3 – figure supplement 2B) that, by definition, include both new PFF and PF disappearance. We feel that exploration of the complex mechanisms at play in PF disappearance is outside the scope of the current study, but could be the subject of an interesting future investigation.

      Finally, it would be good to know if place fields differ according to how they are born. For example, are there differences in reliability, width, peak rate, out-of-field firing, etc for those that arise due to BTSP vs non-BTSP.

      We have analyzed several properties of the PFs and found no significant difference in their width (BTSP: 46.4 ± 24.4 cm; non-BTSP: 50.4 ± 32.5 cm, p = 0.28), peak rates (BTSP: 19.0 ± 14.7 a.u./s; non-BTSP: 21.4 ± 16.8 a.u./s, p = 0.27), or out-of-field firing rates (BTSP: 0.64 ± 0.68 a.u./s; non-BTSP: 0.83 ± 1.25 a.u./s, p = 0.09; all unpaired t-tests). We have included these data in the Results section.

      Reviewer #1 (Recommendations for the authors):

      Consider adding additional visual cues or environmental elements to the virtual reality (VR) setup to create a more enriched and immersive environment. Collect data from a couple of mice in the enriched environment and compare the PFF dynamics to the original environment. This would help determine whether the findings on PFF dynamics hold in a setting where spatial coding may be more robust. Including floor cues, distal visual markers, or varying textures might provide a more comprehensive understanding of the factors influencing BTSP-like and non-BTSP-like events.

      We thank the Reviewer for her/his suggestion of analyzing data obtained from a more enriched VR environment compared to the one we used in our study. We have now included data obtained in a profoundly different VR environment, which did not have sparse dominant visual landmarks; instead, the entire wall was covered with a rich pattern of different shapes and colors. Our data from 11 imaging sessions from 4 mice revealed BTSP- and non-BTSP-like PFF events at approximately the same ratio as that found in our regular maze. These results are described in the Results section and are presented in a new supplementary figure (Figure 2 – figure supplement 2).

      Wherever deconvolved spikes were used for analysis, provide a comparison of results obtained directly from the GCaMP ΔF/F signals versus those derived from the deconvolved spiking data. This could illustrate any differences and help readers understand the limitations and reliability of the inference method.

      We have adopted a method that is currently widely accepted in the field to infer spikes from fluorescence traces, using the Suite2p software package. All of our analyses were then performed on the inferred spikes. To address the concerns of the Reviewer, we analyzed the relationship between the peak [Ca<sup>2+</sup>] transients and inferred spike activity (new Figure 3 – figure supplement 1C-E). Our results clearly demonstrate a robust, highly significant correlation between these measures at the level of individual cells (new Figure 3 – figure supplement 1D) and the Spearman correlation coefficients show a distribution that is very different from random distributions (new Figure 3 – figure supplement 1E). From these, we conclude that using the fluorescence data directly would have resulted in largely similar PF detection and identification.

      Improve the visual clarity of figures by enlarging key elements such as arrows that indicate BTSP-like events. Consider using colors that stand out more clearly to guide readers' attention. Include annotations of statistical significance directly on the figures (e.g., adding NS or * indicators) to make it clear which comparisons are statistically significant. This will help readers quickly interpret the data without needing to refer back to the text.

      Based on the suggestion of the Reviewer, we have enlarged the arrows. We have also indicated statistical results on the figures. Because some of the results of the factorial ANOVA tests are difficult to indicate comprehensively on our plots, we kept the description of the statistical results in the legends as well. We hope that these alterations will make data interpretation easier.

      Replace or supplement bar plots with violin plots or scatter plots that show the distribution of individual data points. This change would offer a clearer picture of data variability and underlying trends, aiding readers in assessing the robustness of the results.

      We have changed the plots and now present all data points.

      Add more detailed quantification in the results section, specifying the total number of newly formed place fields, the proportion that are categorized as BTSP-like versus non-BTSP-like, and how many events did not fit these categories. Explicitly state what fraction of the total recorded place field formations are represented by the 59 non-BTSP-like events mentioned, as this is currently difficult to discern.

      The numbers of BTSP- and non-BTSP-like PFF events are given in the MS. As described in the Methods, after identifying BTSP- and non-BTSP-like PFF events using the shift and gain criteria, we manually checked each of these ROIs and the spatial footprint of every new PFF event for these cells and excluded ROIs with non-soma-like shapes and activities with spurious footprints suggesting contamination, creating a ‘cleaned’ dataset. We did not perform such visual inspection and manual curation of the spatial footprints of every ROI belonging to the two additional categories (no gain with shift, gain without shift, 872 events). Since these classes are also overestimated without curation, we cannot provide a precise fraction of the BTSP- and non-BTSP-like PFF events from the total recorded PFF population. However, assuming that the factors leading to exclusion affect all groups equally, we can provide their fractions by comparing the numbers of newly born PFs in all categories before the visual inspection. In the normal maze, we found 806 candidate BTSP-like (52%) and 164 non-BTSP-like (10%) PFFs; an additional 593 PFs (38%) could not be included in these two groups [40 PFs (3%) with formation lap gain and backward shift but significant backward drift; 238 PFs (15%) with formation lap gain but without backward shift; 315 PFs (20%) with no formation lap gain but with backward shift]. These data have been included in the Methods.
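      The percentages quoted in this response can be checked directly from the stated pre-curation counts. The short sketch below recomputes them. The counts are copied from the response, while the dictionary labels are shorthand introduced here for illustration.

```python
# Pre-curation counts for the normal maze, as quoted in the authors' response.
# The category labels are shorthand introduced for this check.
counts = {
    "BTSP-like candidate": 806,
    "non-BTSP-like": 164,
    "gain + shift, significant backward drift": 40,
    "gain without backward shift": 238,
    "backward shift without gain": 315,
}

total = sum(counts.values())

# Whole-number percentages of all newly formed PFs before visual inspection.
percentages = {name: round(100 * n / total) for name, n in counts.items()}

# The three unclassifiable subgroups reported together as one figure.
unclassified = sum(n for name, n in counts.items()
                   if name not in ("BTSP-like candidate", "non-BTSP-like"))

print(f"total newly formed PFs: {total}")
for name, pct in percentages.items():
    print(f"{name}: {counts[name]} ({pct}%)")
print(f"not assignable to either group: {unclassified} "
      f"({round(100 * unclassified / total)}%)")
```

      The recomputed values agree with the rounded percentages given in the response, and the three subgroups sum to the 593 PFs reported as unclassifiable.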

      Ensure that all statements describing specific findings are consistently linked to the appropriate figures and panels. There are instances in the text where results are discussed without clear references, which can make it challenging for readers to verify the data. For example, the section on population remapping in a novel environment should point directly to the relevant figure panels to guide readers.

      We regret that our text was not linked properly to the appropriate figures. We corrected this during the revision.

      Given that BTSP-like events are inferred rather than directly confirmed, it would be prudent to frame conclusions about their sufficiency in more tentative terms, acknowledging the limitations of the current data. Consider adding a discussion of potential future experiments that could confirm whether these large transients truly represent BTSP events, including evidence for complex spikes or calcium plateau potentials.

      The Reviewer is correct that we do not have direct evidence that all large somatic Ca<sup>2+</sup> events represent dendritic plateau potentials. Now we discuss this and other limitations in the MS (Discussion section).

      Reviewer #2 (Recommendations for the authors):

      The authors outline three characteristics of place fields (PFs) generated by behavioral time scale synaptic plasticity (BTSP) during calcium imaging in the Introduction, as follows: 'First, the prolonged CSB results in large [Ca<sup>2+</sup>] transient during the initial PFF event, typically followed by weaker Ca<sup>2+</sup> signals on consecutive traversals through the PF. Second, due to the long and asymmetric temporal kernel of the plasticity (favoring potentiation of inputs active 1-2 seconds before the CSB) a substantial backward shift in the spatial position of the PF center can be observed on linear tracks after the formation lap. Third, the width of the new PF is generally proportional to the running speed of the animal during the PFF event.' Figure 3B, which displays the third feature for the classified BTSP and non-BTSP data, serves as an important confirmation of the classification results obtained using the first two features. Even though the Spearman correlation indicated a significant difference, the raw data distributions of BTSP and non-BTSP appear similar, suggesting that a bootstrap distribution and a more stringent confirmation are needed to be convincing.

      As described in the MS, because of the difference in the number of events in the two groups, we randomly subsampled the BTSP-like events to the sample size of the non-BTSP-like PFF events 10000 times and performed regression analysis. This bootstrapping revealed that both the r and p values of the fit to the non-BTSP data fell outside the 95% confidence interval of the bootstrapped BTSP values, indicating that the difference between the groups was robust.
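      The subsampling procedure described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic and invented here, the Pearson coefficient stands in for whatever regression statistic the authors used, and the function names are hypothetical. Only the group sizes (310 and 59) and the 10000 repetitions come from the manuscript passages quoted in this response.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def subsampled_r_interval(x, y, n_sub, n_iter=10_000, alpha=0.05, seed=0):
    """Percentile interval of r over random subsamples of size n_sub.

    Each iteration draws n_sub events without replacement from the larger
    group, so every subsample matches the size of the smaller group.
    """
    rng = random.Random(seed)
    indices = range(len(x))
    rs = []
    for _ in range(n_iter):
        pick = rng.sample(indices, n_sub)
        rs.append(pearson_r([x[i] for i in pick], [y[i] for i in pick]))
    rs.sort()
    lower = rs[int((alpha / 2) * n_iter)]
    upper = rs[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# SYNTHETIC data, invented for illustration (speed in cm/s, width in cm).
rng = random.Random(1)
btsp_speed = [rng.uniform(5, 40) for _ in range(310)]
btsp_width = [20 + 1.0 * s + rng.gauss(0, 15) for s in btsp_speed]
non_speed = [rng.uniform(5, 40) for _ in range(59)]
non_width = [45 + rng.gauss(0, 20) for _ in non_speed]

# 2000 iterations keep the example fast; the response reports 10000.
lower, upper = subsampled_r_interval(btsp_speed, btsp_width,
                                     n_sub=len(non_speed), n_iter=2000)
r_non = pearson_r(non_speed, non_width)
print(f"subsampled BTSP-like r, 95% interval: [{lower:.2f}, {upper:.2f}]")
print(f"non-BTSP-like r: {r_non:.2f}, outside interval: "
      f"{not (lower <= r_non <= upper)}")
```

      Matching the subsample size to the smaller group removes the sample-size difference as an explanation for a weaker correlation in the non-BTSP-like events, which is the comparison the response relies on.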

      In further analysis during the revision, we found that the PF width variance explained by distance from landmarks is substantially larger than the variance explained by the running speed during the formation lap. We performed a cross-validated analysis by these two factors (Figure 3D), which highlights that speed explains some of the PF width variance of BTSP-like PFFs, but none of the non-BTSP PFFs.

      The proportions of the three types should be provided. page 6: ' Using a conservative approach, we categorized a new PF to be formed by a BTSP-like mechanism if it had both positive gain and negative shift values (Figure 2A; n = 310 new PFs), whereas new PFs exhibiting neither positive gain nor negative shift were considered as non-BTSP-like events (Figure 2B; n = 59). All other newly formed PFs (no-gain with backward shift and gain without backward shift) were excluded from further analysis.' The number of excluded newly formed PFs should be disclosed, as well as the distribution ratio of these three types in each animal.

      The numbers of BTSP- and non-BTSP-like PFF events are given in the MS. As described in the Methods, after identifying BTSP- and non-BTSP-like PFF events using the shift and gain criteria, we manually checked each of these ROIs and the spatial footprint of every new PFF event for these cells and excluded ROIs with non-soma-like shapes or spurious activities, creating a ‘cleaned’ dataset. We did not perform such visual inspection and manual curation of the spatial footprints of every ROI belonging to the two additional categories (no gain with shift, gain without shift, 872 events). Since these classes are also overestimated without curation, we cannot provide a precise fraction of the BTSP- and non-BTSP-like PFF events from the total recorded PFF population. However, assuming that the factors leading to exclusion affect all groups equally, we can provide their fractions by comparing the numbers of newly born PFs in all categories before the visual inspection. In the normal maze, we found 806 candidate BTSP-like (52%) and 164 non-BTSP-like (10%) PFFs; an additional 593 PFs (38%) could not be included in these two groups [40 PFs (3%) with formation lap gain and backward shift but significant backward drift; 238 PFs (15%) with formation lap gain but without backward shift; 315 PFs (20%) with no formation lap gain but with backward shift]. These data have been included in the Methods.
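      The conservative classification rule quoted by the Reviewer (positive gain and negative shift for BTSP-like, neither for non-BTSP-like, everything else excluded) can be written as a small decision function. This is a hedged sketch: the function name and the use of zero as the threshold are assumptions made for illustration, and it does not model the manual curation or the backward-drift exclusion mentioned in the response.

```python
def classify_pff(gain: float, shift: float) -> str:
    """Classify a newly formed place field from its gain and shift values.

    Assumes, for illustration, that "positive gain" means gain > 0 and
    "negative shift" means shift < 0 (a backward shift of the PF center).
    """
    has_gain = gain > 0
    has_backward_shift = shift < 0
    if has_gain and has_backward_shift:
        return "BTSP-like"
    if not has_gain and not has_backward_shift:
        return "non-BTSP-like"
    # Mixed cases: gain without shift, or shift without gain.
    return "excluded"


# Hypothetical (gain, shift) pairs covering the four combinations.
examples = [(1.5, -4.0), (-0.2, 1.0), (1.5, 2.0), (-0.2, -4.0)]
for gain, shift in examples:
    print(f"gain={gain:+.1f}, shift={shift:+.1f} -> {classify_pff(gain, shift)}")
```

      Requiring both criteria for one label and neither for the other leaves the two mixed combinations unlabelled, which is why the excluded group is large relative to the non-BTSP-like group.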

      Figure 2C, while showing an overall decrease in amplitude from the formation lap to the next lap, could benefit from a pairwise analysis of the corresponding formation lap and the following lap of each session to provide more convincing and detailed results.

      We now present all data with connected lines across consecutive laps to illustrate the changes in each ROI. Our statistical analysis included the pairwise comparison of amplitudes.

      The experiment's time range is broad (11-99 days); it is worth investigating whether different training intervals might influence the results.

      Based on the suggestion of the Reviewer, we have analyzed the elapsed time and the number of sessions from the first training to the recording, and we demonstrate that there is no correlation of these parameters with the number of new PFFs. These data are now presented in Figure 2 – figure supplement 1C.

      It is unclear whether the formation of place fields also generates characteristic features of dendritic properties.

      It is not clear to us which ‘characteristic dendritic features of dendritic properties’ generated by PFF the Reviewer refers to. Since we did not image dendrites of individual CA1PCs, we have no information about dendritic properties of the neurons.

      It may be necessary to add a clearer figure to illustrate the correlation between width and speed following the downsampling of non-BTSP-like events (refer to Figure 3B).

      We have performed extensive additional analysis on the relationship of PF width with various behavioral factors, including the speed of the animal in the formation lap. Inspection of the PF width distributions along the track revealed a close association of PF width with the distance of the animal from the nearest visual landmark in the corridor, so that PFs close to landmarks were narrower than PFs between landmarks. We found that the PF width variance explained by distance from landmarks is substantially larger than the variance explained by the running speed during the formation lap. Nevertheless, there is a clear difference between BTSP-like and non-BTSP-like PFFs: running speed explains some variance in the case of BTSP-like PFFs, but none for non-BTSP-like PFFs.

      We have included these findings into the Results section and created two new panels in Figure 3 (C, D) and Figure 3 – figure supplement 1 (A, B).

      It is recommended that statistical results be labeled in the figures with n.s. or stars for better readability.

      Based on the suggestion of the Reviewer, we have indicated statistical results on the figures. Because some of the results of the factorial ANOVA tests are difficult to indicate comprehensively on our plots, we kept the description of the statistical results in the legends as well. We hope that these alterations will make data interpretation easier.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Lu & Cui et al. observe that adult male zebrafish are more resistant to infection and disease following exposure to Spring Viremia of Carp Virus (SVCV) than female fish. The authors then attempt to identify some of the molecular underpinnings of this apparent sexual dimorphism and focus their investigations on a gene called cytochrome P450, family 17, subfamily A, polypeptide 2 (cyp17a2) because it was among the genes that they found to be more highly expressed in kidney tissue from males than in females. Their investigations lead them to propose a direct connection between cyp17a2 and modulation of interferon signaling as the key underlying driver of the difference between male and female susceptibility to SVCV.

      Strengths:

      Strengths of this study include the interesting observation of a substantial difference between adult male and female zebrafish in their susceptibility to SVCV, and also the breadth of experiments that were performed linking cyp17a2 to infection phenotypes and molecularly to the stability of host and virus proteins in cell lines. The authors place the infection phenotype in an interesting and complex context of many other sexual dimorphisms in infection phenotypes in vertebrates. This study succeeds in highlighting an unexpected factor involved in antiviral immunity that will be an important subject for future investigations of infection, metabolism, and other contexts.

      Weaknesses:

      Weaknesses of this study include an indirect connection between the majority of experiments and the proposed mechanism underlying the sexual dimorphism phenotype, widespread reliance on over-expression when investigating protein-protein interaction and localization, and an insufficient amount of description of the data presented in the figures. Specific examples of areas for clarification or improvement include:

      (1) Figure 10 outlines a mechanistic link between cyp17a2 and the sexual dimorphism the authors report for SVCV infection outcomes. The data presented on increased susceptibility of cyp17a2-/- mutant male zebrafish support this diagram, but this conclusion is fairly weak without additional experimentation in both males and females. The authors justify their decision to focus on males by stating that they wanted to avoid potential androgen-mediated phenotypes in the cyp17a2 mutant background (lines 152-156), but this appears to be speculation. It also doesn't preclude the possibility of testing the effects of increased cyp17a2 expression on viral infection in both males and females. This is of critical importance if the authors intend to focus the study on sexual dimorphism, which is how the introduction and discussion are currently structured.

      (2) The authors present data indicating an unexpected link between cyp17a2 and ubiquitination pathways. It is unclear how a CYP450 family member would carry out such activities, and this warrants much more attention. One brief paragraph in the discussion (starting at line 448) mentions previous implications of CYP450 proteins in antiviral immunity, but given that most of the data presented in the paper attempt to characterize cyp17a2 as a direct interactor of ubiquitination factors, more discussion in the text should be devoted to this topic. For example, are there any known domains in this protein that make sense in this context? Discussion of this interface is more relevant to the study than the general overview of sexual dimorphism that is currently highlighted in the discussion and throughout the text.

      (3) Figures 2-9 contain information that could be streamlined to highlight the main points the authors hope to make through a combination of editing, removal, and movement to supplemental materials. There is a consistent lack of clarity in these figures that could be improved by supplementing them with more text to accompany the supplemental figures. Using Figure 2 as an example, panel (A) could be removed as unnecessary; panel (B) could be exchanged for a volcano plot with examples highlighting why cyp17a2 was selected for further study, and the full dataset could be shared in a supplemental table; panel (C) could be modified to indicate why that particular subset was chosen for plotting, along with an explanation of the scaling; panel (D) could be moved to supplemental because the point is redundant with panels (A) and (C); panel (E) could be presented as a heatmap; in panels (G) and (H), data from EPC cells could be moved to supplemental because it is not central to the phenotype under investigation; and panels (J) to (L) and (N) to (P) could be moved to supplemental because they are redundant with the main points made in panels (M) and (Q). Similar considerations could be made with Figures 3-9.

      (4) The data in Figure 3 (A)-(C) do not seem to match the description in the text. That is, the authors state that cyp17a2 overexpression increases interferon signaling activity in cells, but the figure shows higher increases in vector controls. Additionally, the data in panel (H) are not described. What genes were selected and why, and where are the data on the rest of the genes from this analysis? This should be shared in a supplemental table.

      (5) Some of the reagents described in the methods do not have cited support for the applications used in the study. For example, the antibody for TRIM11 (line 624, data in Figures 6 & 7) was generated for targeting the human protein. Validation for use of this reagent in zebrafish should be presented or cited. Furthermore, the accepted zebrafish nomenclature for this gene would be preferred throughout the text, which is bloodthirsty-related gene family, member 32.

    2. Reviewer #2 (Public review):

      The manuscript identified Cyp17a2 as a master regulator of male-biased antiviral immunity in a sex chromosome-free model (zebrafish), challenging established immunological paradigms.

      Strengths:

      (1) The bifunctional role of Cyp17a2 (host-directed STING stabilization and virus-directed P degradation) represents a significant conceptual advance.

      (2) First demonstration of K33 chains as a critical regulatory switch for both host defense proteins and viral substrates.

      (3) Comprehensive validation across biological scales: organismal (survival, histopathology), cellular (transcriptomics, Co-IPs), and molecular (ubiquitination assays, site-directed mutagenesis).

      (4) Functional conservation in cyprinids (zebrafish and gibel carp) strengthens biological significance.

      Weaknesses:

      (1) Colocalization analyses (Figures 4G, 6I, 9D) require quantitative metrics (e.g., Pearson's coefficients) rather than representative images alone.

      (2) Figure 1 survival curves need annotated statistical tests (e.g., "Log-rank test, p=X.XX").

      (3) Figure 2P GSEA should report exact FDR-adjusted *p*-values (not just "*p*<0.05").

      (4) Section 2 overextends on teleost sex-determination diversity; condensing it to emphasize relevance to immune dimorphism would strengthen narrative cohesion.

      (5) Limited discussion on whether this mechanism extends beyond Cyprinidae and its implications for teleost adaptation.
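
      As an illustration of the quantitative colocalization metric requested in point (1), the following is a minimal sketch of Pearson's correlation coefficient computed over paired pixel intensities from two fluorescence channels. It is not the authors' analysis pipeline: the function name `pearson_colocalization` and the flat-list input format are assumptions made for this example, and a real analysis would typically first restrict the pixels to a segmented region of interest.

```python
import math


def pearson_colocalization(channel_a, channel_b):
    """Pearson's r between two equally sized sequences of pixel intensities.

    Hypothetical helper for illustration; inputs are flat lists of
    intensities taken from the same pixel positions in two channels.
    """
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")

    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n

    # Sum of co-deviations and of squared deviations from each channel mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)

    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")

    return cov / math.sqrt(var_a * var_b)
```

      Values near 1 indicate that the two signals rise and fall together across pixels, values near 0 indicate no linear relationship, and negative values indicate mutual exclusion. Reporting this coefficient per cell, across many cells, would complement the representative images.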

    1. Reviewer #3 (Public review):

      Summary:

      The authors tackle an important problem: defining the topological changes that occur during tumorigenesis. To study this, they use an established stepwise cell model of breast cancer. A strength of their study is a careful, robust differential analysis of topological features across each cell state, which is presented clearly and rigorously. They define changes in compartmentalization, TAD structure, and chromatin looping. Intriguingly, when the authors integrate differential gene expression with chromatin looping, they see that most differentially regulated genes are not involved in loop changes, suggesting that changes in promoter or enhancer chromatin marks may play a bigger role in regulating transcription than differential loops. The differential topology analysis and its integration with transcription is very well done- one of the best versions of this I have read in the 3D genome field! However, the paper is framed largely as a cancer biology study, and it teaches us much less about this. I am worried that some of the trends for each topologic feature are not going to be consistent across the pre-malignant-malignant-metastatic spectrum and would like the authors to soften some of their claims a bit regarding how this clarifies our understanding of cancer evolution.

      Weaknesses:

      Major Concerns:

      (1) The integration of gene expression and chromatin loops is intriguing. The authors' differential analysis, however, omits consideration of genes that are on and simply further upregulated versus genes that transition on/off or off/on. It would be nice to see the authors break out looping patterns for these two different patterns of regulation, as it may be instructive regarding the rules for how EP loops govern transcription.

      (2) Given the paucity of differential loops at the majority of genes whose expression changes, the authors should examine chromatin subcompartments, as these may associate more with differential transcription.

      (3) The authors could push their TAD analysis further by integrating it with transcription. Can they look at genes and their enhancers that span these altered boundaries to see if these shifts impact transcription?

      (4) The progression of cancer critically goes from a benign -> pre-malignant -> malignant -> metastatic series of steps. The AT1 line is described as 'premalignant' and thus the authors' series omits a malignant line. While I think adding such a sample is an unreasonable request at this point (as it would have had to have been studied in 'batch' with these other samples), the authors should acknowledge that they omit this step and spend some time discussing the genetic, morphologic, and phenotypic features for their 3 conditions. The images in Figure 1S aren't particularly useful- they don't tell the reader that these cells are malignant/benign. The karyotypic data are intriguing but not fully analyzed, so it is hard to know what true phenotype these cells represent. For example, malignant means DCIS/invasive carcinoma - so then what does this pre-malignant cell model represent? The described alteration in the AT1 line is a Ras oncogene, so in some sense, the transition to this line really is just +/- Ras. The authors could spend some time thinking about the effects of Ras specifically on the 3D genome.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      The question of how central nervous system (CNS) lamination defects affect functional integrity is an interesting topic, though it remains a subject of debate. The authors focused on the retina, which is a relatively simple yet well-laminated tissue, to investigate the impact of afadin, a key component of adherens junctions, on retinal structure and function. Their findings show that the loss of afadin leads to significant disruptions in outer retinal lamination, affecting the morphology and localization of photoreceptors and their synapses, as illustrated by high-quality images. Despite these severe changes, the study found that some functions of the retinal circuits, such as the ability to process light stimuli, could still be partially preserved. This research offers new insights into the relationship between retinal lamination and neural circuit function, suggesting that altered retinal morphology does not completely eliminate the capacity for visual information processing.

      Strengths:

      The retina serves as an excellent model for investigating lamination defects and functional integrity due to its relatively simple yet well-organized structure, along with the ease of analyzing visual function. The images depicting outer retinal lamination, as well as the morphology and localization of photoreceptors and their synapses, are clear and well-described. The paper is logically organized, progressing from structural defects to functional analysis. Additionally, the manuscript includes a comprehensive discussion of the findings and their implications.

      Weaknesses:

      While this work presents a wealth of descriptive data, it lacks quantification, which would help readers fully understand the findings and compare results with those from other studies. Furthermore, the molecular mechanisms underlying the defects caused by afadin deletion were not explored, leaving the role of afadin and its intracellular signaling pathways in retinal cells unclear. Finally, the study relied solely on electrophysiological recordings to demonstrate RGC function, which may not be robust enough to support the conclusions. Incorporating additional experiments, such as visual behavior tests, would strengthen the overall conclusions. 

      We would like to thank the reviewer for the thoughtful and valuable comments that helped us to further improve the manuscript. We have revised the manuscript to address the following three points in response to the reviewer's comments.

      While this work presents a wealth of descriptive data, it lacks quantification, which would help readers fully understand the findings and compare results with those from other studies.

      In response, we quantified the position of each retinal cell type and measured retinal thickness in the cHet and cKO mice at 1M, as presented in Figures 2F–M. To reflect these additions, we have included explanatory text in the revised manuscript (see lines 507–533).

      Furthermore, the molecular mechanisms underlying the defects caused by afadin deletion were not explored, leaving the role of afadin and its intracellular signaling pathways in retinal cells unclear.

      As AJ components, such as catenin and cadherin, are known to be associated with several signaling pathways, including Notch and Wnt signals (PMID: 37255594), we speculated that these pathways might be disrupted in the afadin cKO retina. Since these pathways are involved in cell proliferation, we examined the number of progenitor cells in the afadin cKO retina at developmental stages P1, P3, and P5 (new Figure S6C, see lines 868-870). No significant differences were observed at any of these stages. We also quantified the number of each retinal cell type at P14 when differentiation is complete. In the cKO retina, the number of BCs significantly increased, whereas the number of photoreceptors significantly reduced (new Figure S4C, see lines 620-622). To our knowledge, activation or inactivation of any AJ-associated signaling pathway does not reproduce the cell fate alterations observed in the afadin cKO retina. These findings suggest that the above pathways related to AJ may be unchanged in the cKO retina. However, we cannot exclude the possibility that multiple signaling pathways may be affected simultaneously or other pathways affected in the cKO retina.

      Finally, the study relied solely on electrophysiological recordings to demonstrate RGC function, which may not be robust enough to support the conclusions. Incorporating additional experiments, such as visual behavior tests, would strengthen the overall conclusions.

      We appreciate the reviewer’s insightful suggestion. To more robustly evaluate visual function in the cKO mice, we performed optomotor response (OMR) and visual cliff tests using cHet, cKO, and optic nerve crush (ONC) mice with Aki Hashio, Yuki Emori, and Mao Hiratsuka. We added their names as co-authors to the new manuscript. In the OMR test, cKO mice exhibited fewer responses to visual stimuli than cHet mice but significantly more than ONC mice. Furthermore, although no significant difference was detected between cKO and ONC mice in the visual cliff test, some cKO mice displayed cautious behavior suggestive of depth perception. These results indicate that cKO mice retain partial visual function, which is consistent with the MEA analysis. We have included these data as the new Figure 8 and incorporated the findings into the revised manuscript in the Introduction (lines 130-131 and 133-134), Methods (lines 378-406), Results (lines 775-816), and Discussion sections (lines 1026-1035).

      Reviewer #2 (Public review):

      Summary:

      Ueno et al. described substantial changes in the afadin knockout retina. These changes include decreased numbers of rods and cones, an increased number of bipolar cells, and disrupted somatic and synaptic organization of the outer limiting membrane, outer nuclear layer, and outer plexiform layer. In contrast, the number and organization of amacrine cells and retinal ganglion cells remain relatively intact. They also observed changes in ERG responses and RGC receptive fields and functions using MEA recordings.

      Strengths:

      The morphological characterization of retinal cell types and laminations is detailed and relatively comprehensive.

      Weaknesses:

      (1) The major weakness of this study, perhaps, is that its findings are predominantly descriptive and lack any mechanistic explanation. As afadin is a key component of adherens junctions, its role in mediating retinal lamination has been reported previously (see PMCID: PMC6284407). Thus, a more detailed dissection of afadin's role in processes such as progenitor generation, cell migration, or the formation of retinal lamination would provide greater insight into the defects caused by knocking out afadin.

      Thank you for the valuable comments. We agree with the reviewer's point that the findings are predominantly descriptive and lack any mechanistic explanation. However, we would like to clarify that the study cited in the comment (PMCID: PMC6284407) analyzed the role of afadin in dendritic stratification of direction-selective RGCs within the IPL, where “lamination” refers to the layering of RGC dendrites in the IPL. Here, we analyzed the function of afadin in the laminar construction of the overall retina.

      In response to the reviewer’s comment, we have added new analyses addressing retinal lamination, as well as the number and spatial distribution of progenitor cells, during development in the cKO retina. These new results are shown in Figures 4E, 9C–F, S5A–C, and S6C of the revised manuscript, and corresponding explanations added in the revised text (lines 643–662 and 855–870).

      (2) The authors observed striking changes in the numbers of rods, cones, and BCs, but not in ACs or RGCs. The causes of these distinct changes in specific cell classes remain unclear. Detailed characterizations, such as the expression of afadin in early developing retina, tracing cell numbers across various early developmental time points, and staining of apoptotic markers in developing retinal cells, could help to distinguish between defects in cell generation and survival, providing a better understanding of the underlying causes of these phenotypes.

      Thank you for the insightful comment. Following the reviewer’s suggestion, we quantified the number of retinal cell types at P14 when cell differentiation is complete (new Figure S4C). At P14, the numbers of photoreceptors and BCs were significantly reduced in the cKO retina, while Müller glia, which were significantly reduced at 1M, showed no difference. We further examined the number of rods and BCs at P1, P3, and P5 (new Figures S4E, F). No significant differences were detected at P1 or P3; however, at P5, rod marker expression was significantly decreased, while the number of BCs was significantly increased. These results suggest that the defects in cell fate determination of BCs and rods begin to emerge between P3 and P5, a period during which rods and BCs actively differentiate. We speculate that cells originally destined to become rods may instead differentiate into BCs in the cKO retina. In addition, we found a significant increase in apoptotic cells at P1, P3, P5, and P14 (new Figure S6B). Furthermore, Müller glia and rod photoreceptors showed significantly greater reduction at 1M compared to P14, suggesting that the reduction in Müller glia observed at 1M may be due to post-differentiation cell death. These are presented in Figures S4C, S4E–F, and S6B, and described in the revised manuscript (lines 620-635 and 827-838).

      (3) Although the total number of ACs or RGCs remains unchanged, their localizations are somewhat altered (Figures 2E and 4E). Again, the cause of the altered somatic localization in ACs and RGCs is unclear.

      Thank you for the valuable question. In response to the reviewer’s comment, we analyzed the position of RGCs and ACs in the developing cKO retina. In the cKO retina at P1, retinal cells were organized into distinct multicellular compartments with clear boundaries, and acellular regions extending to the outer retinal surface were observed at these boundaries. These acellular regions contained dendritic processes of RGCs and ACs, which are components of the IPL, indicating that elements of the IPL extended vertically across the retina. As development progressed, the compartment boundaries gradually shifted toward the inner retina. At P14, the IPL was mainly located on the inner retina, as in the normal retina. However, some IPL structures remained in the outer retina and may correspond to the acellular patches. We have included the above data in the revised manuscript as Figures S5A and S5B and revised the manuscript to include this point (lines 643-660).

      (4) One conclusion that the authors emphasise is that the function of RGCs remains detectable despite major disruption of the outer plexiform layer. However, the organization of the inner plexiform layer remains largely intact, and the axonal innervation of BCs remains unchanged. This could explain the functional integrity of RGCs. In addition, the resolution of detecting RGCs by MEA is low, as they only detected 5 clusters in heterozygous animals. This represents an incomplete clustering of RGC functional types and does not provide a full picture of how functional RGC types are altered in the afadin knockout.

      We appreciate the reviewer’s insightful comments. Although our clustering of RGC subtypes in afadin cHet retinas resulted in only five clusters, the key finding of our study is the preservation of RGC receptive fields in afadin cKO retinas, despite severe photoreceptor loss (reduced to about one-third of normal) and disruption of photoreceptor-bipolar cell synapses in the OPL. This suggests that even with crucial damage to the OPL, the primary photoreceptor-bipolar-RGC pathway can still function as long as the IPL remains intact. Moreover, the presence of rod-driven responses in RGCs indicates that the AII amacrine cell-mediated rod pathway may also continue to function. We agree that our functional clustering in afadin cHet retinas was incomplete. However, we speculate that the absence of RGCs with fast temporal responses in afadin cKO retinas may not simply be due to the loss of specific RGC subtypes but due to disrupted synaptic connections between photoreceptors and fast-responding BCs. Furthermore, the structural abnormalities in retinal lamination in afadin cKO retinas may alter RGC response properties, making strict functional classification less meaningful. We would like to emphasize the finding that disruption of the retinal lamination in afadin cKO retinas leads to the absence of RGCs with fast temporal response properties, rather than focusing solely on the classification of RGC subtypes.

      Minor Comments:

      (1) Line 56-67: "Overall, these findings provide the first evidence that retinal circuit function can be partially preserved even when there are significant disruptions in retinal lamination and photoreceptor synapses" There is existing evidence showing substantial adaptation in retinal function when retinal lamination or photoreceptor synapses are disrupted, such as PMCID: PMC10133175.

      Thank you for your comment. We agree that the original sentence was ambiguous in its wording, and we have revised it to clarify our intended meaning (lines 48-50):

      "Overall, these findings provide the first evidence that retinal circuit function can be partially preserved even when there are significant disruptions in both retinal lamination and photoreceptor synapses."

      The paper you mentioned is crucial for discussing and considering the results of our study. We have cited this study and added the following sentence to the Discussion section of the revised manuscript (lines 910-915):

      “Furthermore, RFs of RGCs are also detected in several mouse models of retinitis pigmentosa, in which rod photoreceptors are degenerated and surviving cone photoreceptors lose their OS discs and pedicles, instead forming abnormal processes resembling synaptic dendrites (Barhoum et al., 2008; Ellis et al., 2023; Scalabrino et al., 2022).”

      (2) Line 114-115: "we focused on afadin, which is a scaffolding protein for nectin and has no ortholog in mice." The term "Ortholog" is misused here, as the mouse has an afadin gene. Should the intended meaning be that afadin has no other isoforms in mouse?

      Thank you for pointing it out. As we had misused "ortholog" where "paralog" was intended, we revised the sentence (line 108).

      Recommendations for the authors:

      (1) The introduction to afadin is insufficient. Please provide more background information about this protein.

      Following the reviewer’s recommendations, we expanded the Introduction in the revised manuscript to provide a more detailed background on afadin, as follows (lines 108-119):

      “Afadin regulates the localization of nectin, which initiates cell–cell adhesion and promotes AJ formation by recruiting the cadherin–catenin complex (Ohama et al., 2018; Takai and Nakanishi, 2003). In addition, afadin interacts with various cell adhesion and signaling molecules, as well as the actin cytoskeleton, and contributes to the accumulation of β-catenin, αE-catenin, and E-cadherin at AJs (Sakakibara et al., 2018; Sato et al., 2006). Afadin KO mice exhibit severe disruption of AJs in the ectoderm, along with other developmental defects, leading to embryonic lethality (Ikeda et al., 1999; Zhadanov et al., 1999). Conditional deletion of afadin in RGCs leads to disruption of dendrites in ON-OFF direction-selective RGCs (Duan et al., 2018). However, the effect of afadin loss on retinal lamination, circuit formation, and function is poorly understood.”

      (2) In Figure 1A (Bottom), regarding the peptide+ image, what does the green signal represent?

      The green signal observed in the peptide+ image represents the background and non-specific staining. We have added the sentence to the legend of Figure 1A in the revised manuscript (lines 1067-1068).

      (3) In the RESULTS section on page 17, the statement "Nectin-1, unlike nectin-2 and nectin-3, was partially co-localized with afadin at the OPL and IPL, in addition to the OLM" suggests that nectin-2 is also expressed at the IPL, as shown in Figure S1A. Providing high-power images, similar to those in Figure S1B, could help readers clearly recognize the staining signals.

      Following your suggestion, we added higher-magnification images of Nectin-2 signals in the IPL to Figure S1A and included the following clarification in the Figure legend (lines 1356-1358):

      “Nectin-2 and nectin-3 were localized in the OLM. The Nectin-2 signal in the IPL was insufficient for reliable assessment of its localization and colocalization.”

      (4) Figure S2A requires an uncropped scan of the membrane after Western blotting to demonstrate that there are no non-specific bands when using this afadin antibody, which was also utilized for IHC.

      We revised the new Figure S2C to include the uncropped membrane scan. Faint non-specific bands were observed in the Western blot, consistent with detecting non-specific signals in immunostaining using the anti-afadin antibody pre-absorbed with its antigen peptide.

      (5) IHC staining is necessary to demonstrate the knockout of afadin in retinal cells, as the paper does not show Cre expression in the retinal cells of the Dkk3-Cre mouse line. This would also help verify the specificity of the afadin antibody.

      In the cKO retina, the laminar structure was disrupted, and the background signal was generally high, making it difficult to reliably assess whether afadin expression was lost using immunostaining with the anti-afadin antibody. Therefore, in addition to the Western blot analysis already presented, we evaluated Cre activity in the Dkk3-Cre mouse line by crossing it with the R26-H2B-EGFP reporter line. Cre-mediated recombination was observed in all retinal cells at P0 and 1M. We have added these results to a revised Figure S2A and B and included explanatory text in the revised manuscript (lines 455–458).

      (6) Why is the outer nuclear layer (ONL) severely impaired in the cKO mice when afadin is not expressed in this layer? Additionally, given that afadin is highly expressed in the inner plexiform layer (IPL), why does the cKO not affect its structure?

      We speculate that the AJ defect in the outer retina during development may cause severe disruption of the ONL in afadin cKO mice. As shown in new Figure 9, ectopic AJs and aberrant position of mitotic cells were observed in the P0 cKO retina. These defects caused abnormal cell migration and position, resulting in the ONL disruption. On the other hand, in the IPL, afadin and other cell adhesion molecules may function redundantly, and thus, the IPL structure would be kept intact in the afadin cKO retina. We have added this interpretation to the Discussion section of the revised manuscript (lines 998–1005).

      (7) In the RESULTS section on page 20, the authors state, "We further investigated adherens junctions (AJs) in the cKO retina by immunostaining with OLM adherens junction markers β-catenin, N-cadherin, and nectin-1. We found that these signals were dispersed in the cKO retina (Figure S2C)." It appears that β-catenin, N-cadherin, and nectin-1 can still be detected in the cKO retina.

      We agree with the reviewer that β-catenin, N-cadherin, and nectin-1 can still be detected in the cKO retina. We used the term 'dispersed' to indicate that the signal was “scattered” rather than “disappeared”. To avoid confusion, we have revised the wording in the revised manuscript (line 499).

      (8) In Figure 3, please indicate where the zoomed-in images were captured from the low-power images. Additionally, point out the locations of zoomed-in images in other figures as well.

      Following the reviewer’s suggestion, we updated Figures 2D, 3A-C, 4A, S2D, S3A, S3D, S3E, and S5D. The related Figure legends have also been revised.

      (9) The authors should include individual data points in all statistical graphics to provide a clearer presentation of the data.

      As suggested by the reviewers, we have revised all statistical graphs to display individual data points. Furthermore, the statistical analysis of synapse counts in Figures 3E, 3F, and S3C has been changed to linear mixed models (LMM) or generalized LMM to account for the variability in the number of synapses within individual mice.

      (10) In the RESULTS section on page 23, the statement "These data indicate that the rosette-like structure in the cKO may be an ectopic IPL, termed 'acellular patches'". What is the mechanism that may cause the rosette-like structure to translocate from the IPL to the outer region of the retina?

      Thank you for raising a valuable question. To clarify the mechanism of acellular patch formation in the cKO mice, we analyzed the position of RGCs and ACs in the developing cKO retina. In the cKO retina at P1, retinal cells were organized into distinct multicellular compartments with clear boundaries, and acellular regions extending to the outer retinal surface were observed at these boundaries. These acellular regions contained dendritic processes of RGCs and ACs, which are components of the IPL, indicating that elements of the IPL extended vertically across the retina. As development progressed, the compartment boundaries gradually shifted toward the inner retina. At P14, the IPL was mainly located on the inner retina, as in the normal retina. However, some IPL structures remained in the outer retina and may correspond to the acellular patches. We have included these findings in the revised manuscript as Figures S5A and S5B and added the corresponding description to the text (lines 643–665).

      (11) Is the blood vessel structure normal in the cKO retina? Could this impact the survival of retinal cells?

      Thank you for your valuable comment. We performed immunostaining with an anti-CD31 antibody, a marker for blood vessels, as shown in the new Figure S2G. No apparent differences were observed in the cKO retina. We have added the following description to the revised manuscript (lines 539–543):

      “It has been reported that defects in the distal processes of Müller glia are associated with abnormal retinal vasculature (Shen et al., 2012). Thus, we immunostained the cKO retina with anti-CD31, a blood vessel marker, but no apparent vascular abnormalities were detected (Figure S2G).”

      (12) In the RESULTS section on pages 26-29, there is a lot of statistical information included in parentheses. It would be more concise to place this information in the figure legends, if possible.

      Following the reviewer's suggestion, we have moved the statistical information from the main text (pages 26–29) to the corresponding Figure legends.

      (13) In the RESULTS section on page 28, the authors state, "On the other hand, the inner retina was apparently normal, and both the inner nuclear layer (INL) and IPL could be recognized." However, in Fig 7A, it appears that the INL is mixed with the ONL and cannot be clearly identified.

      We agree with the reviewer that the INL is mixed with the ONL and cannot be clearly identified. Accordingly, we have revised the description in the text (lines 740–742) as follows:

      “On the other hand, the inner retina was apparently normal, and both the IPL and the proximal part of the INL could be recognized.”

      (14) It is mentioned in the manuscript that "The receptive field (RF) area in the cKO retinas was significantly smaller than that in the cHet retinas." Is there an impairment in the dendritic fields of RGCs in the cKO retina that could lead to a smaller RF?

      Thank you for asking an interesting question. The dendritic field reflects the region where presynaptic cells can form synaptic contacts, whereas the receptive field is dynamically shaped by spatiotemporal excitatory and inhibitory inputs, gap junctions, and membrane properties of the dendrites. Consequently, the size of the dendritic field does not necessarily correspond to that of the receptive field. Moreover, the disruption of the retinal lamination in the afadin cKO retina may alter the morphology of RGC dendritic fields, even when RNA expression levels are identical, which makes it difficult to compare the morphology of the same RGC subtype exactly between afadin cHet and afadin cKO retinas. Additionally, because there are over 40 RGC subtypes and the afadin cKO retina contains rosette-like structures, it is challenging to trace the complete dendritic arborization of individual RGCs. For these reasons, we are hesitant to compare the dendritic field size with the receptive field size.

      (15) Figure 7H was not cited in the corresponding section of the main text.

      Thank you for pointing it out. We have added a citation of Figure 7H in the revised manuscript (line 759).

      (16) In Figure 8C, is there a difference in the number of pHH3+ mitotic cells between the cHet and cKO mice?

      We quantified the number of pHH3-positive cells in the cKO retina at P0, as shown in the new Figure 9B. The number of mitotic cells was significantly increased in the cKO retina (see lines 853–855). In contrast, the number of BrdU-labeled progenitor cells at P1, P3, and P5 was not significantly different between cHet and cKO retinas, as presented in the new Figure S6C. These results suggest that although the total number of progenitor cells remains unchanged in cKO retinas, the M phase may be prolonged.

      (17) The results related to Figure 8 should be moved to a location before Figure 5, as Figure 8 is also related to the lamination defects.

      In the original manuscript, Figures 2–7 presented the phenotypes observed in the cKO retina, while Figure 8 addressed the possible cause of the lamination defects. Since the revised Figure 8 presents behavioral tests evaluating visual function, the phenotypic analyses are presented in the revised Figures 2–8. In response to the reviewers’ comments, we further analyzed the distribution of mitotic and progenitor cells during development and included these results as revised Figure 9.

      (18) In the DISCUSSION section on page 32, the authors state, "A few photoreceptor-bipolar cell-retinal ganglion cell (BC-RGC) pathways (vertical pathways of the retina) are inferred to be maintained in the cKO retina." The authors could verify this using retrograde transsynaptic tracing with a pseudorabies virus injected into the superior colliculus.

      Thank you for your interesting suggestion. This is an important point, and the recommended experiment is an excellent idea. We attempted this analysis; however, the virus injected into the superior colliculus successfully labeled RGCs but failed to reach BCs and photoreceptors in normal mice. We consider that the RGC firing evoked by light stimulation indicates that the photoreceptor-bipolar cell-retinal ganglion cell (BC-RGC) pathways are functional.

    1. Total blood volume constitutes 7% to 8% of an adult's body weight, which means that a 70 kg individual has approximately 5.5 L of blood. This volume comprises both the extracellular fluid (the plasma) and the formed elements of the blood (which also contain fluid), above all the erythrocytes, which contain hemoglobin dissolved in water. Given that the hematocrit represents 45% of total blood volume, an adult has approximately 2000 to 2400 mL of erythrocytes, representing 3.2% to 3.5% of body weight, and 3000 to 3500 mL of plasma, representing 5% of total weight. In men, blood volume is 61 to 66 mL/kg, of which 24 to 30 mL/kg are erythrocytes and 33 to 35 mL/kg are plasma. In women, these values are 1 to 2 mL lower. Relative to body surface area, an adult's blood volume is 2.89 L/m2 in men and 2.44 L/m2 in women. In young people and children, the volume is relatively greater. At birth, blood volume is almost 300 mL, and it doubles during the first year of life. In infants, blood volume is up to 83 mL/kg. More than half of the total blood volume, namely 64%, is in the veins; 13% is in the arteries, 9% in the pulmonary circulation, 7% in the capillaries and arterioles, and the other 7% in the cardiac chambers. Under normal conditions, the heart rate is 75 beats per minute, with 70 mL of blood delivered to the aorta per beat, giving a total of approximately 5.25 liters/minute. Blood is distributed among the main organs as follows: brain, 14%; coronary circulation, 4%; kidneys, 22%; liver, 27%; resting muscles, 15%; bones, 5%; skin, 6%; bronchi, 2%.
During exercise, the heart rate can increase to 150 beats per minute, so that cardiac output rises to 20 liters per minute and blood volume varies. Blood volume decreases because of fluid loss and the movement of fluid into the interstitial space, so the hematocrit undergoes a relative increase. It returns to normal conditions during the following 30 minutes. Other factors that influence blood volume are body temperature, altitude above sea level, the amount of fluid ingested, and the degree of exercise or rest. During pregnancy there is a state of relative hemodilution that increases blood volume by up to more than one liter.
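
      The arithmetic behind the figures quoted above can be checked with a short sketch. The function names are illustrative, and treating 1 kg of blood as roughly 1 L is a simplifying assumption made here, not something stated in the passage.

```python
# Worked arithmetic for the values quoted in the passage (hypothetical helper names).

def cardiac_output_l_per_min(heart_rate_bpm: float, stroke_volume_ml: float) -> float:
    """Cardiac output (L/min) = heart rate (beats/min) x volume per beat (mL) / 1000."""
    return heart_rate_bpm * stroke_volume_ml / 1000.0


def blood_volume_range_l(body_mass_kg: float,
                         low_fraction: float = 0.07,
                         high_fraction: float = 0.08) -> tuple[float, float]:
    """Blood volume range, assuming 1 kg of blood occupies about 1 L (approximation)."""
    return body_mass_kg * low_fraction, body_mass_kg * high_fraction


def implied_stroke_volume_ml(cardiac_output_l: float, heart_rate_bpm: float) -> float:
    """Volume per beat (mL) implied by a given cardiac output and heart rate."""
    return cardiac_output_l * 1000.0 / heart_rate_bpm


resting_output = cardiac_output_l_per_min(75, 70)        # resting values from the passage
low_l, high_l = blood_volume_range_l(70)                 # 7% to 8% of a 70 kg adult
exercise_stroke = implied_stroke_volume_ml(20, 150)      # exercise values from the passage

print(f"Resting cardiac output: {resting_output:.2f} L/min")
print(f"Blood volume for 70 kg: {low_l:.1f} to {high_l:.1f} L")
print(f"Implied volume per beat during exercise: {exercise_stroke:.0f} mL")
```

      The 7% to 8% range brackets the 5.5 L quoted for a 70 kg adult, and the exercise figures imply a larger volume per beat than at rest; that last number is derived here and does not appear in the passage.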

      I found this information; please check whether it is useful.

    1. So, at some point between about ten thousand years ago and about seven thousand years ago, there seems to have been a gradual shift in which some people in relatively abundant regions such as river valleys began first cultivating food plants near their camps or villages, and then in several regions began altering the plants to make them more productive.

      This is interesting to me because it shows how people slowly changed from just gathering plants to actually growing and improving them, which was the beginning of farming and a big step in human history.

    1. Reviewer #2 (Public review):

      Summary:

      This manuscript proposes that the use of a latent cause model for assessment of memory-based tasks may provide improved early detection in Alzheimer's Disease as well as more differentiated mapping of behavior to underlying causes. To test the validity of this model, the authors use a previously described knock-in mouse model of AD and subject the mice to several behaviors to determine whether the latent cause model may provide informative predictions regarding changes in the observed behaviors. They include a well-established fear learning paradigm in which distinct memories are believed to compete for control of behavior. More specifically, it's been observed that animals undergoing fear learning and subsequent fear extinction develop two separate memories for the acquisition phase and the extinction phase, such that the extinction does not simply 'erase' the previously acquired memory. Many models of learning require the addition of a separate context or state to be added during the extinction phase and are typically modeled by assuming the existence of a new state at the time of extinction. The Niv research group, Gershman et al. 2017, have shown that the use of a latent cause model applied to this behavior can elegantly predict the formation of latent states based on a Bayesian approach, and that these latent states can facilitate the persistence of the acquisition and extinction memory independently. The authors of this manuscript leverage this approach to test whether deficits in production of the internal states, or the inference and learning of those states, may be disrupted in knock-in mice that show both a build-up of amyloid-beta plaques and a deterioration in memory as the mice age.

      Strengths:

      I think the authors' proposal to leverage the latent cause model and test whether it can lead to improved assessments in an animal model of AD is a promising approach for bridging the gap between clinical and basic research. The authors use a promising mouse model and apply this to a paradigm in which the behavior and neurobiology are relatively well understood - an ideal situation for assessing how a disease state may impact both the neurobiology and behavior. The latent cause model has the potential to better connect observed behavior to underlying causes and may pave a road for improved mapping of changes in behavior to neurobiological mechanisms in diseases such as AD.

      The authors also compare the latent cause model to the Rescorla-Wagner model and a latent state model, allowing for better assessment of the latent cause model as a strong model for assessing reinstatement.

      Weaknesses:

      I have several substantial concerns which I've detailed below. These include important details on how the behavior was analyzed, how the model was used to assess the behavior, and the interpretations that have been made based on the model.

      (1) There is substantial data to suggest that during fear learning in mice separate memories develop for the acquisition and extinction phases, with the acquisition memory becoming more strongly retrieved during spontaneous recovery and reinstatement. The Gershman paper, cited by the authors, shows how the latent causal model can predict this shift in latent causes by allowing for the priors to decay over time, thereby increasing the posterior of the acquisition memory at the time of spontaneous recovery. In this manuscript, the authors suggest a similar mechanism of action for reinstatement, yet the model does not appear to return to the acquisition memory after reinstatement, at least based on the simulation and examples shown in figures 1 and 3. More specifically, in figure 1, the authors indicate that the posterior probability of the latent cause, z<sub>A</sub> (the putative acquisition memory), increases, partially leading to reinstatement. This does not appear to be the case, as test 3 (day 36) appears to have similar posterior probabilities for z<sub>A</sub> as well as similar weights for the CS as compared to the last days of extinction. Rather, the model appears to mainly modify the weights in the most recent latent cause, z<sub>B</sub> (the putative 'extinction state'), during reinstatement. The authors suggest that previous experimental data have indicated that spontaneous recovery or reinstatement effects are due to an interaction of the acquisition and extinction memory. 
These studies have shown that conditioned responding at a later time point after extinction is likely due to a balance between the acquisition memory and the extinction memory, and that this balance can shift towards the acquisition memory naturally during spontaneous recovery, or through artificial activation of the acquisition memory or inhibition of the extinction memory (see Lacagnina et al. for example). Here the authors show that the same latent cause learned during extinction, z<sub>B</sub>, appears to dominate during the learning phase of reinstatement, with rapid learning to the context - the weight for the context goes up substantially on day 35 - in z<sub>B</sub>. This latent cause, z<sub>B</sub>, dominates at the reinstatement test, and due to the increased associative strength between the context and shock, there is a strong CR. For the simulation shown in figure 1, it's not clear why a latent cause model is necessary for this behavior. This leads to the next point.

      (2) The authors compared the latent cause model to the Rescorla-Wagner model. This is very commendable, particularly since the latent cause model builds upon the RW model, so it can serve as an ideal test for whether a more simplified model can adequately predict the behavior. The authors show that the RW model cannot successfully predict the increased CR during reinstatement (Appendix figure 1). Yet there are some issues with the way the authors have implemented this comparison:

      (2A) The RW model is a simplified version of the latent cause model and so should be treated as a nested model when testing, or at a minimum, the number of parameters should be taken into account when comparing the models using a method such as the Bayesian Information Criterion, BIC.

      (2B) The RW model provides the associative strength between stimuli and does not necessarily require a linear relationship between V and the CR. This is the case in the original RW model as well as in the LCM. To allow for better comparison between the models, the authors should be modeling the CR in the same manner (using the same probit function) in both models. In fact, there are many instances in which a sigmoid has been applied to RW associative strengths to predict CRs. I would recommend modeling CRs in the RW as if there is just one latent cause. Or perhaps run the analysis for the LCM with just one latent cause - this would effectively reduce the LCM to RW and keep any other assumptions identical across the models.

      (2C) In the paper, the model fits for the alphas in the RW model are the same across the groups. Were the alphas for the two models kept as free variables? This is an important question as it gets back to the first point raised. Because the modeling of the reinstatement behavior with the LCM appears to be mainly driven by latent cause z<sub>B</sub>, the extinction memory, it may be possible to replicate the pattern of results without requiring a latent cause model. 
For example, the 12-month-old App NL-G-F mice may have a deficit in learning about the context. Within the RW model, if the alpha for context is set to zero for those mice, but kept higher for the other groups, say alpha_context = 0.8, the authors could potentially observe the same pattern of discrimination indices in figure 2G and 2H at test. Because the authors don't explicitly state which parameters might be driving the change in the DI, the authors should show in some way that their results cannot simply be due to poor contextual learning in the 12-month-old App NL-G-F mice, as this can presumably be predicted by the RW model. The authors' model fits using RW don't show this, but this is because they don't consider the possibility that the alpha for context might be disrupted in the 12-month-old App NL-G-F mice. Of course, using the RW model with these alphas won't lead to fits of the behavior across acquisition, extinction, and reinstatement that are as good as those of the authors' LCM, since the number of parameters is substantially reduced in the RW model. Yet the important pattern of the DI would be replicated with the RW model (if I'm not mistaken), which is the important test for assessment of reinstatement.
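
      The possibility raised in (2C) can be illustrated with a minimal Rescorla-Wagner sketch. Everything numeric here is a hypothetical assumption chosen for illustration (trial counts, the CS learning rate, a US magnitude of 1); it is not the authors' protocol or fitted model, and only alpha_context = 0.8 comes from the comment above.

```python
# Minimal Rescorla-Wagner sketch: dV_i = alpha_i * beta * (lambda - sum of V over present cues).
# Trial counts and learning rates are hypothetical, for illustration only.

def rescorla_wagner(trials, alphas, beta=1.0):
    """Apply RW updates. `trials` is a list of (present_cues, lambda_) pairs."""
    strengths = {cue: 0.0 for cue in alphas}
    for cues, lam in trials:
        prediction_error = lam - sum(strengths[c] for c in cues)
        for c in cues:
            strengths[c] += alphas[c] * beta * prediction_error
    return strengths


def strength_at_test(alpha_context, alpha_cs=0.2):
    """Summed associative strength for context + CS after a simplified protocol."""
    trials = (
        [(("context", "cs"), 1.0)] * 3      # acquisition: CS paired with shock
        + [(("context", "cs"), 0.0)] * 20   # extinction: CS without shock
        + [(("context",), 1.0)] * 2         # reinstatement: shock alone in the context
    )
    strengths = rescorla_wagner(trials, {"context": alpha_context, "cs": alpha_cs})
    return strengths["context"] + strengths["cs"]


intact_context_learning = strength_at_test(alpha_context=0.8)
no_context_learning = strength_at_test(alpha_context=0.0)
print(f"alpha_context = 0.8 -> strength at test: {intact_context_learning:.3f}")
print(f"alpha_context = 0.0 -> strength at test: {no_context_learning:.3f}")
```

      In this toy setting, responding at test returns only when the context can acquire strength during the shock-alone phase, which is the pattern the comment suggests an RW model might reproduce. Mapping the summed strength to a CR through a sigmoid or probit, as suggested in (2B), is left out of the sketch.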

      (3) As stated by the authors in the introduction, the advantage of the fear learning approach is that the memory is modified across the acquisition-extinction-reinstatement phases. Although perhaps not explicitly stated by the authors, the post-reinstatement test (test 3) is the crucial test for whether there is reactivation of a previously stored memory, with the general argument being that the reinvigorated response to the CS can't simply be explained by relearning the CS-US pairing, because re-exposure to the US alone leads to an increased response to the CS at test. Of course, there are several explanations for why this may occur, particularly when also considering the context as a stimulus. This is what I understood to be the justification for the use of a model, such as the latent cause model, that may better capture and compare these possibilities within a single framework. As such, it is critical to look at the level of responding to both the context alone and to the CS. It appears that the authors only look at the percent freezing during the CS, and it is not clear whether this is due to the contextual-US learning during the US re-exposure or to increased responding to the CS - presumably caused by reactivation of the acquisition memory. The authors do perform a comparison between the preCS and CS period, but it is not clear whether this is taken into account in the LCM. For example, the instance of the model shown in figure 1 indicates that the 'extinction cause', or cause z6, develops a strong weight for the context during the reinstatement phase of presenting the shock alone. This state then leads to increased freezing during the final CS probe test as shown in the figure. If they haven't already, I think the authors must somehow incorporate these different phases (CS vs ITI) into their model, particularly since this type of memory retrieval that depends on assessing latent states is specifically why the authors justified using the latent causal model. 
In more precise terms, it's not clear whether the authors incorporate a preCS/ITI period each day the cue is presented as a vector of just the context, in addition to the CS period in which the vector contains both the context and the CS. Based on the description, it seemed to me that they only model the CRs during the CS period on days when the CS is presented, and thereby the context is only ever modeled on its own (as just the context by itself in the vector) on extinction days when the CS is not presented. If they are modeling both timepoints each day that the CS is presented, then I would recommend explicitly stating this in the methods section.

      (4) The authors fit the model using all data points across acquisition and learning. As one of the other reviewers has highlighted, it appears that there is a high chance for overfitting the data with the LCM. Of course, this would result in much better fits than models with substantially fewer free parameters, such as the RW model. As mentioned above, the authors should use a method that takes into account the number of parameters, such as the BIC.
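
      The parameter penalty mentioned in (2A) and (4) can be sketched as follows. The log-likelihoods, parameter counts, and number of observations are invented for illustration and do not come from the manuscript.

```python
import math

# BIC = k * ln(n) - 2 * ln(L_hat); lower values indicate the preferred model.
# All numbers below are hypothetical and only illustrate the penalty on parameter count.

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion for a fitted model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


n_observations = 100
simple_model = bic(log_likelihood=-120.0, n_params=4, n_obs=n_observations)
complex_model = bic(log_likelihood=-110.0, n_params=12, n_obs=n_observations)

print(f"Simple model BIC:  {simple_model:.1f}")
print(f"Complex model BIC: {complex_model:.1f}")
print("Preferred:", "simple" if simple_model < complex_model else "complex")
```

      With these made-up values the model with more parameters fits better in raw likelihood yet has the higher BIC, which is the kind of outcome the comparison is meant to detect.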

      (5) The authors have stated that they do not think the Barnes maze task can be modeled with the LCM. Whether or not this is the case, if the authors do not model this data with the LCM, the Barnes maze data doesn't appear valuable to the main hypothesis. The authors suggest that more sophisticated models such as the LCM may be beneficial for early detection of diseases such as Alzheimer's, so the Barnes maze data is not valuable for providing evidence of this hypothesis. Rather, the authors make an argument that the memory deficits in the Barnes maze mimic the reinstatement effects, providing support that memory is disrupted similarly in these mice. Although the authors state that the deficits in memory retrieval are similar across the two tasks, they are not explicit as to the precise deficits in memory retrieval in the reinstatement task: it's a combination of overgeneralizing latent causes during acquisition, a poor learning rate, and over-differentiation of the stimuli.

    1. Bacterial cultures were first centrifuged at 15,000 x g for 10 min at 4 °C in a tabletop centrifuge (Heraeus Megafuge 16R, Thermo Scientific, Waltham, USA) to remove bacterial mass. The supernatants were filtered through 0.22 µm PES bottle top filters (Fisher Brand, New Hampshire, USA) and concentrated approximately 10-fold by tangential flow filtration (VivaFlow 200, 100 kDa cutoff, polyethersulfone, Sartorius, Göttingen, Germany). The concentrate was then ultracentrifuged for 2 h at 150,000 x g and 4 °C in a Beckman Optima LE-80K (Beckman Coulter, Brea, USA). The supernatant was discarded, and the pellet was washed twice in 1 mL ultrapure water. Afterward, the BMV pellet was resuspended in 1 mL ultrapure water and stored at 4 °C for a maximum of 1 week.

      Would you expect a difference in vesicle stability, or selective depletion of more labile vesicles (e.g., smaller/unsaturated or protein-rich)?

    1. Reviewer #4 (Public review):

      Summary

      In this study, López-Jiménez and colleagues demonstrate the utility of using high-content microscopy in dissecting host and bacterial determinants that play a role in the establishment of infection using Shigella flexneri as a model. The manuscript nicely identifies that infection with Shigella results in a block to DNA replication and protein synthesis. At the same time, the host responds, in part, via the entrapment of Shigella in septin cages.

      Strengths:

      The main strength of this manuscript is its technical aspects. They nicely demonstrate how an automated microscopy pipeline coupled with artificial intelligence can be used to gain new insights regarding elements of bacterial pathogenesis, using Shigella flexneri as a model system. Using this pipeline enabled the investigators to enhance the field's general understanding regarding the role of septin cages in responding to invading Shigella. This platform should be of interest to those who study a variety of intracellular microbial pathogens.

      Another strength of the manuscript is the demonstration - using cell biology-based approaches- that infection with Shigella blocks DNA replication and protein synthesis. These observations nicely dovetail with the prior findings of other groups. Nevertheless, their clever click-chemistry-based approaches provide visual evidence of these phenomena and should interest many.

      Weaknesses:

      There are two main weaknesses of this work. First, the studies are limited to findings obtained using a single immortalized cell line. It is appreciated that HeLa cells serve as an excellent model for studying aspects of Shigella pathogenesis and host responses. However, it would be nice to see that similar observations are observed with an epithelial cell line of intestinal, preferably colonic origin, and eventually, with a non-immortalized cell line, although it is appreciated that the latter studies are beyond the scope of this work.

      The other weakness is that the studies are minimally mechanistic. For example, the investigators have data to suggest that infection with Shigella leads to an arrest in DNA replication and protein synthesis; however, no follow-up studies have been conducted to determine how these host cell processes are disabled. Interestingly, Zhang and colleagues recently identified that the Shigella OspC effectors target eukaryotic translation initiation factor 3 to block host cell translation (PMID: 38368608).

    2. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      Summary:

      In this study, López-Jiménez and colleagues demonstrated the utility of using high-content microscopy in dissecting host and bacterial determinants that play a role in the establishment of infection using Shigella flexneri as a model. The manuscript nicely identifies that infection with Shigella results in a block to DNA replication and protein synthesis. At the same time, the host responds, in part, via the entrapment of Shigella in septin cages.

      Strengths:

      The main strength of this manuscript is its technical aspects. They nicely demonstrate how an automated microscopy pipeline coupled with artificial intelligence can be used to gain new insights regarding elements of bacterial pathogenesis, using Shigella flexneri as a model system. Using this pipeline enabled the investigators to enhance the field's general understanding regarding the role of septin cages in responding to invading Shigella. This platform should be of interest to those who study a variety of intracellular microbial pathogens.

      Another strength of the manuscript is the demonstration - using cell biology-based approaches- that infection with Shigella blocks DNA replication and protein synthesis. These observations nicely dovetail with the prior findings of other groups. Nevertheless, their clever click-chemistry-based approaches provide visual evidence of these phenomena and should interest many.

      We thank the Reviewer for their enthusiasm on technical aspects of this paper, regarding both the automated microscopy pipeline coupled with artificial intelligence and the click-chemistry based approaches to dissect DNA replication and protein synthesis by microscopy.

      Weaknesses:

      There are two main weaknesses of this work. First, the studies are limited to findings obtained using a single immortalized cell line. It is appreciated that HeLa cells serve as an excellent model for studying aspects of Shigella pathogenesis and host responses. However, it would be nice to see that similar observations are observed with an epithelial cell line of intestinal, preferably colonic origin, and eventually, with a non-immortalized cell line, although it is appreciated that the latter studies are beyond the scope of this work.

      The immortalized cell line HeLa is widely regarded as a paradigm to study infection by Shigella and other intracellular pathogens. However, we agree that future studies beyond the scope of this work should include other cell lines (e.g., epithelial cells of colonic origin, macrophages, primary cells).

      The other weakness is that the studies are minimally mechanistic. For example, the investigators have data to suggest that infection with Shigella leads to an arrest in DNA replication and protein synthesis; however, no follow-up studies have been conducted to determine how these host cell processes are disabled. Interestingly, Zhang and colleagues recently identified that the Shigella OspC effectors target eukaryotic translation initiation factor 3 to block host cell translation (PMID: 38368608). This paper should be discussed and cited in the discussion.

      We appreciate the Reviewer’s concern about the lack of follow up work on observations of host DNA and protein synthesis arrest upon Shigella infection, which will be the focus of future studies. We acknowledge the recent work of Zhang et al. (Cell Reports, 2024) considering their similar results on protein translation arrest, and this reference has been more fully discussed in the revised version of the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Septin caging has emerged as one of the innate immune responses of eukaryotic cells to infections by intracellular bacteria. This fascinating assembly of eukaryotic proteins into complex structures restricts bacteria motility within the cytoplasm of host cells, thereby facilitating recognition by cytosolic sensors and components of the autophagy machinery. Given the different types of septin caging that have been described thus far, a single-cell, unbiased approach to quantify and characterise septin recruitment at bacteria is important to fully grasp the role and function of caging. Thus, the authors have developed an automated image analysis pipeline allowing bacterial segmentation and classification of septin cages that will be very useful in the future, applied to study the role of host and bacterial factors, compare different bacterial strains, or even compare infections by clinical isolates.

      Strengths:

      The authors developed a solid pipeline that has been thoroughly validated. When tested on infected cells, automated analysis corroborated previous observations and allowed the unbiased quantification of the different types of septin cages as well as the correlation between caging and bacterial metabolic activity. This approach will prove an essential asset in the further characterisation of septin cages for future studies.

      We thank the Reviewer for their positive comments, and for highlighting the strength of our imaging and analysis pipeline to analyse Shigella-septin interactions.

      Weaknesses:

      As the main aim of the manuscript is to describe the newly developed analysis pipeline, the results illustrated in the manuscript are essentially descriptive. The developed pipeline seems exceptionally efficient in recognising septin cages in infected cells but its application for a broader purpose or field of study remains limited.

      The main objective of this manuscript is the development of imaging and analysis tools to study Shigella infection, and in particular, Shigella interactions with the septin cytoskeleton. In future work we will provide more mechanistic insight with novel experiments and broader applicability, using different cell lines (in agreement with Reviewer 1), mutants or clinical isolates of Shigella, and different bacterial species (e.g., Listeria, Salmonella, mycobacteria).

      Reviewer #3 (Public Review):

      Summary:

      The manuscript uses high-content imaging and advanced image-analysis tools to monitor the infection of epithelial cells by Shigella. They perform some analysis on the state of the cells (through measurements of DNA and protein synthesis), and then they focus on differential recruitment of Sept7 to the bacteria. They link this recruitment with the activity of the bacterial T3SS, which is a very interesting discovery. Overall, I found numerous exciting elements in this manuscript, and I have a couple of reservations. Please see below for more details on my reservations. Nevertheless, I think that these issues can be addressed by the authors, and doing so will help to make it a convincing and interesting piece for the community working on intracellular pathogens. The authors should also carefully re-edit their manuscript to avoid overselling their data (see below for issues I see there). I would consider taking out the first figure and starting with Figure 3 (Figure 2 could be re-organized in the later parts)- that could help to make the flow of the manuscript better.

      Strengths:

      The high-content analysis, including the innovative analytical workflows, is very promising and could be used by a large number of scientists working on intracellular bacteria. The finding that septins (through SEPT7) are differentially regulated by actively secreting bacteria is very exciting and can steer novel research directions.

      We thank the Reviewer for their constructive feedback and excitement for our results, including our findings on T3SS activity and Shigella-septin interactions. In accordance with the Reviewer’s comments, we avoid overselling our data in the revised version of the manuscript.

      Weaknesses:

      The manuscript makes a connection between two research lines (1: Shigella infection and DNA/protein synthesis, 2: regulation of septins around invading Shigella) that are not fully developed - this makes it sometimes difficult to understand the take-home messages of the authors.

      We agree that the manuscript is mostly technical and therefore some of our experimental observations would benefit from follow up mechanistic studies in the future. We highlight our vision for broader applicability in response to weaknesses raised by Reviewer 2.

      It is not clear whether the analysis that was done on projected images actually reflects the phenotypes of the original 3D data. This issue needs to be carefully addressed.

      We agree with the Reviewer that characterizing 3D data using 2D projected images has limitations.

      We observe an increase in cell and nuclear surface that does not strictly imply a change in volume. This is why we measure Hoechst intensity in the nucleus using SUM-projection (as it can be used as a proxy of DNA content of the cell). However, we agree that future use of other markers (such as fluorescently labelled histones) would make our conclusions more robust.

      Regarding the different orientation of intracellular bacteria, we agree that investigation of septin recruitment is more challenging when bacteria are placed perpendicular to the acquisition plane. In a first step, we trained a Convolutional Neural Network (CNN) using 2D data, as it is easier/faster to train and requires fewer annotated images. In doing so, we already managed to correctly identify 80% of Shigella interacting with septins, which enabled us to observe higher T3SS activity in this population. In future studies, we will maximize the 3D potential of our data and retrain a CNN that will allow more precise identification of Shigella-septin interactions and in depth characterization of volumetric parameters.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) To conclude that cell volume is indeed increased, the investigators should consider staining the cells with markers that demarcate cell boundaries and/or are confined to the cytosol, i.e., a cell tracker dye.

      Staining using our SEPT7 antibody enables us to define cell boundaries for cellular area measurements (Novel Figure 1 - figure supplement 1A). However, we agree with the Reviewer that staining cells with additional markers (such as a cell tracker dye) would be required to conclude that cell volume is increased. We therefore adjust our claims in the main text (lines 107-115 and 235-246).

      (2) Line 27: I understand what is meant by "recruited to actively pathogenic bacteria with increased T3SS activation." However, one could argue that intracytosolic bacteria play many different roles in pathogenesis, not just actively secreting effectors.

      T3SS secretion by cytosolic bacteria is tightly regulated and both T3SS states (active, inactive) likely contribute to the pathogenic lifestyle of S. flexneri. In agreement with this, we removed this statement from the manuscript (lines 27, 225 and 274).

      (3) Line 88: Please clarify in the text that HeLa cells are being studied.

      We explicitly mention that the epithelial cell line we study is HeLa in the main text (line 93), in addition to the Materials and methods (line 328).

      (4) Line 97: is it possible to quantify the average distance of the nuclei from the cell perimeter? This would help provide some context as to what it means to be a certain distance from the nucleus, i.e., is there another way to point out that distance from nuclei correlates with movement inward post-invasion at the periphery?

      To provide more context to the inward movement of bacteria to the cell centre, we provide calculations based on measurements in Figure 1G, I. If we approximate the geometric shape of both cell and nucleus as a circle, the median radius of a HeLa cell is 31.1 µm (uninfected cell) and 36.3 µm (infected cell). Similarly, the median radius of the nucleus is 22.2 µm (uninfected cell) and 24.57 µm (infected cell).

      However, we note that Figure 1F shows distance of bacteria to the centroid of the cell, which is the geometric centre of the cell, and which does not necessarily coincide with the geometric centre of the nucleus. We also note that nuclear area increases with infection (in a bacterial dose dependent manner). Finally, we note that these measurements are performed on max projections of 3D Z-stacks. In this case we cannot fully appreciate distance to the nucleus for bacteria located above it.
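
      The circle approximation in this response converts a projected area A into a radius via r = sqrt(A / π). A minimal sketch follows, assuming areas are given in µm²; `radius_from_area` is a hypothetical helper and the example area is a made-up value, not a measurement from Figure 1.

```python
import math


def radius_from_area(area_um2: float) -> float:
    """Radius (µm) of a circle with the given area (µm²): r = sqrt(A / pi)."""
    if area_um2 < 0:
        raise ValueError("area must be non-negative")
    return math.sqrt(area_um2 / math.pi)


# Hypothetical projected cell area, for illustration only (not study data).
example_area_um2 = 3000.0
example_radius_um = radius_from_area(example_area_um2)
print(f"radius = {example_radius_um:.1f} µm")
```

      Because the measurements come from max projections, a radius obtained this way describes the 2D footprint only and carries no information about cell height.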

      (5) Lines 212-213 - there is no Figure 9A, B - I think this should be Figure 7A, B.

      Text has been updated (lines 216-217).

      Reviewer #2 (Recommendations For The Authors):

      Testing the analysis pipeline as a proof-of-concept question such as the comparison of caging around the laboratory strain as compared to one or a few clinical isolates or mutants of interest would help stress the relevance of this new, remarkable tool.

      We thank the Reviewer for their enthusiasm.

      Future research in the Mostowy lab will capitalise on the high-content tools generated here to explore the frequency and heterogeneity of septin cage entrapment for a wide variety of S. flexneri mutants and Shigella clinical isolates.

      The sentence in line 215 ends with "in agreement with" followed by a reference.

      Text has been updated (line 219).

      The sentence in line 217 on the correlation between caging and T3SS is not very clear.

      Text has been clarified (lines 221-223).

      There is a typo in line 219 : "protrusSions"

      Text has been updated (line 223).

      Reviewer #3 (Recommendations For The Authors):

      Major points

      The quantitative analysis approach in Figure 1 has multiple issues. Some examples:

      (1) How was the cell area estimated? Normally, a marker for the whole cell (CellMask or similar) or cells expressing GFP would be good indicators. Here it is not clear to me what was done.

      The cell area was estimated using SEPT7 antibody staining which is enriched under the cell cortex. CellProfiler was used to segment cells based on SEPT7 staining, using a propagation method from the identified nucleus based on Otsu thresholding. To provide more clarity on how this was performed, we now include a new figure (Figure 1- figure supplement 1A) showing a representative image of HeLa cells stained with SEPT7 and the corresponding cell segmentation performed with CellProfiler software, together with an updated figure legend explaining the procedure (lines 784–787).
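
      For readers unfamiliar with Otsu thresholding, the idea is to pick the intensity cut-off that maximises the variance between background and foreground pixels. The sketch below is a plain-Python illustration of that principle only; it does not reproduce CellProfiler's implementation or its propagation step from the nucleus, and `otsu_threshold` and the toy pixel values are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance.

    Pixels with intensity <= t are treated as background, the rest as foreground.
    `pixels` is a flat list of integer intensities in the range [0, levels).
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy "image": 50 dim pixels and 50 bright pixels (made-up values).
toy_pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(toy_pixels)
foreground_mask = [p > threshold for p in toy_pixels]
```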

      (2) The authors use Hoechst and integrated z-projections (Figure 1 S1) as a proxy to estimate nuclear volume. Hoechst staining depends on the organization of the DNA within the nucleus and I find that the authors need to do better controls to estimate nuclear size - this would be possible with cells expressing fluorescently labeled histones, or even better with a fluorescently tagged nuclear pore/envelope marker. The current quantification approach is misleading.

      We understand Reviewer #3’s concerns about using Hoechst staining as a proxy of nuclear volume, due to potential differences in DNA organisation within the nucleus.

      Following the recommendation of Reviewer #3 in the following point 3, text has been updated (lines 107–115 and 235-246).

      (3) Was cell density assessed for the measurements? If cells are confluent, bacteria could spread between cells within 3 hrs, if cells are less dense, this does not occur. When epithelial cells are infected for some hours, they have the tendency to round up a bit (and to appear thicker in z), but a bit smaller in xy. My suggestion to the authors (as they use these findings to follow up with experiments on the underlying processes) would be to tone down their statements - eg, Hoechst staining could be simply indicated as altered, but not put in a context of size (this would require substantial control experiments).

      Local cell density was not directly measured, but the experiment was set up to infect at roughly 80% confluency (cells were seeded at 10⁴ cells/well 2 days prior to infection in a 96-well microplate, as described in the Materials and methods section) and to ensure bacterial spread between cells.

      In agreement with Reviewer #3 we tone down statements in the main text (see response to point 2 above).

      In addition, I found Figure 1 (and parts of Figure 2) disconnected from the rest of the manuscript, and it may even be an idea to take it out of the manuscript (that could also help to deal with my feedback relating to Figure 1). I would suggest starting the manuscript with the current Figure 3 and building the biological story with a stronger focus on SEPT7 (and its links with T3 secretion and actively pathogenic bacteria) from there on. As it stands, the two parts of the manuscript are not well connected.

      We carefully considered this comment but following revisions we have not reorganised the manuscript. We believe that high-content characterisation of S. flexneri infection in Figure 1 and 2 provides insightful information about changes in host cells in response to infection. Following this, we move onto characterising intracellular bacteria (and in particular those entrapped in septin cages) in the second part of the manuscript (Figure 3-7). Similar methods were used to analyse both host and bacterial cells and results obtained offer complementary views on host-pathogen interactions.

      My major reservation with the experimental work of the current version of the manuscript relates to Figure 5: The analysis of the septin phenotypes in Figure 5 seems to be problematic - to me, it appears that analysis and training were done on projected image stacks. As bacteria are rod-shaped their orientation in space has an enormous impact on how the septin signal appears in a projection - this can lead to wrong interpretation of the phenotypes. The authors need to do some quantitative controls analyzing their data in 3D. To be more clear: the example "tight" (second row) shows a bacterium that appears short. It may be that it's actually longer if one looks in 3D, and the septin signal could possibly fall in the category "rings" or even "two poles".

      The deep learning training and subsequent analysis of septin-cage entrapment is done on projected Z-stacks, which presents limitations. Future work in the Mostowy lab will exploit this first study and dive deeper into 3D aspects of the data.

      To address Reviewer #3’s concern, we include a sentence explaining that this analysis was performed using 2D max projections (lines 708 and 724), as well as acknowledging its limitations in the main text (lines 259-262).

      Minor points

      The scale bar in Fig 1 is very thin.

      We corrected the scale bar in Fig. 1 to make it more visible.

      Could it be that Figure 1F is swapped with Figure1E in the description?

      Descriptions for Figure 1E and F are correct.

      Line 27: what does "actively pathogenic bacteria" mean? I propose to change the term.

      We agree with Reviewer #3 that “actively pathogenic bacteria” should be removed from the text. This update is also in agreement with Reviewer #1 (see Reviewer #1 point 2).

      Line 28: "dynamics" can be confusing as it relates to dynamic events imaged by time-lapse.

      Although we are taking a snapshot of the infection process at 3 hpi, we capture asynchronous processes in both host and bacterial cells (e.g., host cells infected with different bacterial loads, bacterial cells undergoing actin polymerisation or septin cage entrapment). We agree that we are not following dynamics of full events over time. However, our high-content approach enables us to capture different stages of dynamic processes. To avoid confusion, we replace “dynamics” with “diverse interactions” (line 28), and we discuss the importance of follow-up studies using time-lapse microscopy (line 274).

      Paragraph 59 following: the concept of heterogeneity was investigated in some detail for viral infection by the Pelkmans group (PMID: 19710653) using advanced image analysis tools. Advanced machine-learning-based analysis was then performed on Salmonella invasion by Voznica and colleagues (PMID: 29084895). It would be great to include these somewhat "old" works here as they really paved the way for high-content imaging, and the way analyses were performed then should be also discussed in light of how analyses can be performed now with the approaches developed by the authors.

      We agree. These landmark studies have now been included in the main text (lines 71-74).

      Line 181: I do not know what "morphological conformations" means, perhaps the authors can change the wording or clarify.

      We replaced the phrase “morphological conformations” with “morphological patterns” to improve clarity in the main text (line 185).

      The authors claim (eg in the abstract) that they are measuring the dynamic infection process. To me, it appears that they look at one time-point, so no dynamic information can be extracted. I suggest that the authors tone down their claims.

      Please note our response above (Minor points, Line 28) which also refers to this question.

    1. Reviewer #1 (Public review):

      This is my first review of this manuscript. The authors included previous reviews for a different journal with a length of 90 and 39 pages; I did not review this reply in my assessment of the paper itself. Influenza prediction is not my area of expertise.

      A major concern is that the model is trained in the midst of the COVID-19 pandemic and its associated restrictions and validated on 2023 data. The situation before, during, and after COVID is fluid, and one may not be representative of the other. The situation in 2023 may also not have been normal and reflective of 2024 onward, both in terms of the amount of testing (and positives) and measures taken to prevent the spread of these types of infections. A further worry is that the retrospective prospective split occurred in October 2020, right in the first year of COVID, so it will be impossible to compare both cohorts to assess whether grouping them is sensible.

      The outcome of interest is the number of confirmed influenza cases. This is not only a function of weather, but also of the amount of testing. The amount of testing is also a function of historical patterns. This poses the real risk that the model confirms historical opinions through increased testing in those higher-risk periods. Of course, the models could also be run to see how meteorological factors affect testing and the percentage of positive tests. The results only deal with the number of positives (only the overall number of tests is noted briefly), which means there is no way to assess how reasonable and/or variable these other measures are. This is especially concerning as there was massive testing for respiratory viruses during COVID in many places, possibly including China.

      (1) Although the authors note a correlation between influenza and the weather factors, they do not discuss some of the high correlations between weather factors (e.g., solar radiation and UV index). Because of the many weather factors, those plots are hard to parse.

      (2) The authors do not actually compare the results of both methods and what the LSTM adds.

      Minor comments:

      (3) The methods are long and meandering. They could be cleaned up and shortened. E.g., there is no need for 30 lines on PCR testing; the study area should come before the study design. The authors discuss similar elements in multiple places; this whole section can be shortened considerably without affecting the content.

      (4) How reliable is the "Our World in Data" website for subnational coverage of restrictions? Some of the authors are from Putian and should be able to confirm the accuracy for both studied areas.

      (5) Figure 2A is hard to parse; it would make more sense to plot these as line plots (y=count, x=month).

    2. Reviewer #2 (Public review):

      Summary:

      The study's aim, to assess the associations between meteorological drivers and influenza, is important although not new. The authors used only 6 years of surveillance data and deep learning models, combining distributed lag non-linear models (DLNM) with Bayesian-optimized LSTM neural networks for predictive modeling. The key interest in this area is to explore subtropical locations, where influenza is less common and circulates year-round. The authors further claimed that such an association could provide an early warning in the community. In this direction, the current manuscript has scope for several improvements and clarifications of its claims, as I list here.

      Strengths:

      Study design based on a prospective cohort to analyse the data for retrospective outcomes.

      Weaknesses:

      (1) The rationale of the study is not clearly stated.

      (2) Several methodological and data-integration issues should be clarified.

      (3) Validation of the models is not presented clearly.

      (4) The claim for providing tools for 'early warning' was not validated by analysis and results.

    3. Author response:

      Reviewer #1 (Public review):

      A major concern is that the model is trained in the midst of the COVID-19 pandemic and its associated restrictions and validated on 2023 data. The situation before, during, and after COVID is fluid, and one may not be representative of the other. The situation in 2023 may also not have been normal and reflective of 2024 onward, both in terms of the amount of testing (and positives) and measures taken to prevent the spread of these types of infections. A further worry is that the retrospective prospective split occurred in October 2020, right in the first year of COVID, so it will be impossible to compare both cohorts to assess whether grouping them is sensible.

      We fully concur with the reviewer that the COVID-19 pandemic represents a profound confounding factor that fundamentally impacts the interpretation and generalizability of our model. This is a critical point that deserves a more thorough treatment. In the revised manuscript, we will add a dedicated subsection in the Discussion to explicitly analyze the pandemic’s impact. We will reframe our model’s contribution not as a universally generalizable tool for a hypothetical “normal” future, but as a robust framework demonstrated to capture complex epidemiological dynamics under the extreme, non-stationary conditions of a real-world public health crisis. We will argue that its strong performance on the 2023 validation data, a unique post-NPI “rebound” year, specifically showcases its utility in modeling volatile periods.

      The outcome of interest is the number of confirmed influenza cases. This is not only a function of weather, but also of the amount of testing. The amount of testing is also a function of historical patterns. This poses the real risk that the model confirms historical opinions through increased testing in those higher-risk periods. Of course, the models could also be run to see how meteorological factors affect testing and the percentage of positive tests. The results only deal with the number of positives (only the overall number of tests is noted briefly), which means there is no way to assess how reasonable and/or variable these other measures are. This is especially concerning as there was massive testing for respiratory viruses during COVID in many places, possibly including China.

      The reviewer raises a crucial point regarding surveillance bias, which is inherent in studies using reported case data. We acknowledge this limitation and will address it more transparently.

      (1) Clarification of Available Data: Our manuscript states that over the six-year period, a total of 20,488 ILI samples were tested, yielding 3,155 positive cases (line 471; Figure 1). We will make this denominator more prominent in the Methods section. However, the reviewer is correct that our models for Putian and the external validation for Sanming utilize the daily positive case counts as the outcome. The reality of our surveillance data source is that while we have the aggregate total of tests over six years, obtaining a reliable daily denominator of all respiratory virus tests conducted (not just for ILI patients as per the surveillance protocol) is not feasible. This is a common constraint in real-world public health surveillance systems.

      (2) Justification and Discussion: We will add a detailed paragraph to the Limitations section to address this. We will justify our use of case counts as it is the most direct metric for assessing public health burden and planning resource allocation (e.g., hospital beds, antivirals). We will also explain that modeling the positivity rate presents its own challenges, as the ILI denominator is also subject to biases (e.g., shifts in healthcare-seeking behavior, co-circulation of other pathogens causing similar symptoms). We will thus frame our work as forecasting the direct surveillance signal that public health officials monitor daily.

      Although the authors note a correlation between influenza and the weather factors, they do not discuss some of the high correlations between weather factors (e.g., solar radiation and UV index). Because of the many weather factors, those plots are hard to parse.

      This is an excellent point. Our preliminary analysis (Supplementary Figure S2) indeed confirms a strong positive correlation between solar radiation and the UV index. Perhaps the reviewer overlooked the contents of the supplementary information document. We have included the figure for their review. Our original discussion did explicitly address this multicollinearity, summarized as follows: We acknowledge the high correlation between certain meteorological variables. We then explain that our two-stage modeling approach is designed to mitigate this issue. In the first stage, the DLNM models assess the impact of each variable individually, thus isolating their non-linear and lagged effects without being confounded by interactions. In the second stage, the LSTM network, by its nature, is a powerful non-linear function approximator that is robust to multicollinearity and can learn the complex, interactive relationships between all input features, including correlated ones.

      Figure S2. Scatterplot matrix illustrating correlations between Influenza cases and meteorological factors. This comprehensive scatterplot matrix visualizes the relationships between influenza-like illness (ILI) cases, influenza A and B cases, and multiple meteorological variables, including average temperature, humidity, precipitation, wind speed, wind direction, solar radiation, and ultraviolet (UV) index. The figure is composed of three distinct sections that collectively provide an in-depth analysis of these relationships:

      (1) Upper-right triangle: This section presents a Pearson correlation coefficient matrix, with color intensity reflecting the strength of correlations between the variables. Red cells represent positive correlations, while green cells represent negative correlations. The closer the coefficient is to 1 or -1, the darker the cell and the stronger the correlation, with statistically significant correlations marked by asterisks. This matrix allows for a rapid identification of notable relationships between influenza cases and meteorological factors.

      (2) Lower-left triangle: This section contains scatterplots of pairwise comparisons between variables. These scatterplots facilitate the visual identification of potential linear or non-linear relationships, as well as any outliers or anomalies. This visualization is essential for evaluating the nature of interactions between meteorological factors and influenza cases.

      (3) Diagonal: The diagonal displays the density distribution curves for each individual variable. These curves provide an overview of the distribution characteristics of each variable, revealing central tendencies, variance, and any skewness present in the data.
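
      The Pearson coefficients summarised in the upper-right triangle can be computed pairwise as sketched below. This is an illustration under stated assumptions: `pearson_r` and `correlation_matrix` are hypothetical helpers, the variable names are placeholders, and the values are made up rather than taken from the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson r for a dict mapping variable name -> list of values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Made-up daily values for illustration only (not the study data).
toy = {
    "solar_radiation": [10.0, 14.0, 18.0, 22.0, 25.0],
    "uv_index": [3.0, 4.0, 6.0, 7.0, 8.0],
    "humidity": [80.0, 75.0, 71.0, 66.0, 60.0],
}
matrix = correlation_matrix(toy)
```

      A coefficient close to 1 between two predictors, as the reviewer anticipates for solar radiation and UV index, is the usual signal that the two carry largely overlapping information.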

      The authors do not actually compare the results of both methods and what the LSTM adds.

      We thank the reviewer for this comment and realize we may not have signposted the comparison clearly enough. Our manuscript does present a direct comparison between the LSTM and ARIMA models in the Results section (lines 737-745) and Table 2, where performance metrics (MAE, RMSE, MAPE, SMAPE) for both models on the 2023 validation set are detailed, showing LSTM’s superior performance, particularly for Influenza A. Furthermore, Figure 6 (panels A and B) visualizes the LSTM’s predictions against observed values, and Supplementary Figure S3 does the same for the ARIMA model, allowing for a visual comparison of their fit.

      To address the reviewer’s concern, in the revised manuscript, we will:

      (1) Add a more explicit comparative statement in the Results section, directly contrasting the key metrics and highlighting the LSTM’s advantages in capturing peak activities.

      (2) Consider combining the visualizations from Figure 6 and Supplementary Figure S3 into a single, more powerful comparative figure that shows the observed data, the LSTM predictions, and the ARIMA predictions on the same plot.
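
      For reference, the four error metrics named above can be written out as in the sketch below. Several details are assumptions rather than the manuscript's definitions: MAPE skips days with zero observed cases (where it is undefined), SMAPE uses the mean of absolute observed and predicted values as its denominator, and the numbers are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error (%), skipping days with zero observed cases."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all observed values are zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def smape(actual, predicted):
    """Symmetric MAPE (%), denominator (|a| + |p|) / 2; 0/0 terms count as zero error."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        if denom != 0:
            total += abs(a - p) / denom
    return 100.0 * total / len(actual)


# Made-up daily counts and forecasts, for illustration only.
observed = [0, 2, 4, 10]
forecast = [0, 3, 4, 8]
scores = {
    "MAE": mae(observed, forecast),
    "RMSE": rmse(observed, forecast),
    "MAPE": mape(observed, forecast),
    "SMAPE": smape(observed, forecast),
}
```

      Daily case counts often contain zeros, so how zero-count days are handled in MAPE and SMAPE can change the reported values; stating that choice alongside the comparison table helps readers interpret it.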

      Meandering methods; reliability of “Our World in Data”; Figure 2A is hard to parse.

      We will address these points comprehensively.

      (3) Methods: We will significantly streamline and restructure the Methods section. We also wish to provide context that the manuscript’s current structure reflects an effort to incorporate feedback from multiple rounds of peer review across different journals, which may have led to some repetition. We will perform a thorough edit to improve its conciseness and logical flow.

      (4) Data Reliability: The reviewer raises a crucial and highly insightful question regarding the validity of using a national-level index to represent local public health interventions. This is a critical aspect of our model’s construction, and we are grateful for the opportunity to provide a more thorough justification.

      We acknowledge that the ideal variable would be a daily, quantitative, city-level index of non-pharmaceutical interventions (NPIs). However, the practical reality of the data landscape in China is that such granular, publicly accessible databases for subnational regions do not exist. Given this constraint, our choice of the Our World in Data (OWID) national stringency index was the result of a careful consideration process, and we believe it serves as the best available proxy for our study context.

      In the revised manuscript, we will significantly expand the Methods section to articulate our rationale, which is threefold:

      National Policy Coherence: During the COVID-19 pandemic in mainland China, core NPIs, particularly mandatory face-covering policies in shared public spaces, were implemented with a high degree of national uniformity. While local governments had some autonomy, they operated within a centrally defined framework, ensuring a baseline level of policy consistency across the country.

      Local Context Alignment: A key factor supporting the use of this national proxy is the specific epidemiological context of Putian during the study period. For the vast majority of the pandemic, Putian was classified as a low-risk area with only sporadic COVID-19 cases. Consequently, the city’s public health measures consistently aligned with the standard national guidelines. It did not experience prolonged or exceptionally strict local lockdowns that would cause a significant deviation from the national-level policy trends captured by the OWID index.

      Validation by Local Public Health Experts: Most critically, and to directly address your suggestion, our co-authors from the Putian Center for Disease Control and Prevention have meticulously reviewed the OWID stringency index against their on-the-ground, institutional knowledge of the mandates that were in effect. They have confirmed that the categorical levels (0-4) and the temporal trends of the OWID index provide a faithful representation of the public health restrictions concerning face coverings as experienced by the population of Putian.

      Therefore, we will revise our manuscript to make it clear that the use of the OWID index was not a choice of convenience, but a necessary and well-vetted decision. Given the unavailability of official local data, the OWID index, cross-validated by our local experts, represents the most rigorous and appropriate variable available to account for the profound impact of NPIs on influenza transmission in our model.

      (5) Figure 2A: We agree completely and will replace the heatmap with a multi-line plot or a stacked area chart to better visualize the temporal dynamics of influenza subtypes.

      We have preliminarily completed the redrawing of Figure 2A. The new and old versions are presented for your review to determine which figure is more suitable for this manuscript in terms of scientific accuracy and visual impact.

      Reviewer #2 (Public review):

      Weakness (1):

      The rationale of the study is not clearly stated.

      We appreciate the reviewer’s critique and acknowledge that the unique contribution of our study needs to be articulated more forcefully. Our introduction (lines 105-140) attempted to outline the limitations of existing studies, but we will revise it to be much sharper. The revised introduction will state unequivocally that our study’s rationale is to address a confluence of specific, unresolved gaps in the literature: 1) The persistent challenge of forecasting influenza in subtropical regions with their erratic seasonality; 2) The lack of studies that build subtype-specific models for Influenza A and B, which we show have distinct meteorological drivers; 3) The methodological gap in integrating the explanatory power of DLNM with the predictive power of a rigorously Bayesian-optimized LSTM network; and 4) The unique opportunity to develop and test a model on data that encompasses the unprecedented disruption of the COVID-19 pandemic, a critical test of model robustness.

      Weakness (2):

      Several methodological and data-integration issues should be clarified.

      We interpret this as a general statement, with the specific issues detailed in the reviewer’s subsequent points and the “Recommendations for the authors” section. We will meticulously address each of these specific points in our revision. For instance, as a demonstration of our commitment to clarification, we will provide a much more detailed justification for our choice of benchmark model (ARIMA), as detailed in our response to Recommendation #11.

      Reviewer #2 (Recommendation for the authors):

      The authors should justify why the LSTM model was compared only with ARIMA as the baseline. How sensitive would the outcomes be to other commonly used machine learning methods, such as Random Forest or XGBoost, as benchmarks for performance?

      The reviewer raises a highly pertinent question regarding the selection of our benchmark model. A robust comparison is indeed essential for contextualizing the performance of our proposed LSTM network. Our choice to benchmark against the ARIMA model was a deliberate and principled decision, grounded in the specific literature of influenza forecasting at the intersection of climatology and epidemiology.

      In the revised manuscript, we will expand our justification within the Methods section and reinforce it in the Discussion. Our rationale is as follows:

      (1) ARIMA as the Established Standard: As we briefly noted in our original introduction (lines 110-113), the ARIMA model is arguably the most widely established and frequently cited statistical method for time-series forecasting of influenza incidence, including studies investigating meteorological drivers. It serves as the conventional benchmark against which novel methods in this specific domain are often evaluated. Therefore, demonstrating superiority over ARIMA is the most direct and scientifically relevant way to validate the incremental value of our deep learning approach.

      (2) A Focused Scientific Hypothesis: Our primary hypothesis was that the LSTM network, with its inherent ability to capture complex non-linearities and long-term dependencies, could overcome the documented limitations of linear autoregressive models like ARIMA in the context of climate-influenza dynamics. Our study was designed specifically to test this hypothesis.

      (3) Avoiding a “Bake-off” without a Clear Rationale: While other machine learning models like Random Forest or XGBoost are powerful, they are not established as the standard baseline in this particular niche of literature. Including them would shift the focus from a targeted comparison against the conventional standard to a broader, less focused “bake-off” of various algorithms. Such an exercise, while potentially interesting, would risk diluting the core message of our paper and would be undertaken without a clear, literature-driven hypothesis for why one of these specific tree-based models should be the next logical benchmark.

      Therefore, we will argue in the revised manuscript that our focused comparison with ARIMA provides the clearest and most meaningful assessment of our model’s contribution to the existing body of work on climate-informed influenza forecasting. We will, however, explicitly acknowledge in the Discussion that future work could indeed benefit from a broader comparative analysis as the field continues to evolve and adopt a wider array of machine learning techniques.

      Similarly, for some of the reviewer’s recommendations that do not require significant time and effort to implement, such as recommendation 7, we have also redrawn Figure 3 based on your feedback. It is provided for your review.

      Figure 3 presents the time series of the cases. I wonder whether the data for these factors and outcomes are daily or aggregated by week/month? I suggest representing it in 9x1 format with a single x-axis to compare, instead of 3x3 format. The authors can refer to a similar plot in Figure 1 of https://doi.org/10.1371/journal.pcbi.1012311.

      We are deeply grateful for the reviewer’s valuable suggestion and thoughtful provision of reference illustrations. Based on their input, we have redrawn Figure 3 and have included it for their review.

      Weakness (3):

      Validation of the models is not presented clearly.

      We were concerned by this comment and conducted a thorough self-assessment of our manuscript. We believe we have performed a multi-faceted validation, but we have evidently failed to present it with sufficient clarity and structure. Our validation strategy, detailed across the Methods and Results sections, includes:

      • Internal Out-of-Time Validation: Using 2023 data as a hold-out set to test the model trained on 2018-2022 data (lines 695-696, 705-710; Figure 6A, B).

      • External Validation: Testing the trained model on an independent dataset from a different city, Sanming (lines 730-736; Figure 6I, J).

      • Benchmark Model Comparison: Quantitatively comparing the LSTM’s performance against the standard ARIMA model using multiple error metrics (lines 737-745; Table 2).

      • Interpretability Validation (Sanity Check): Using SHAP analysis to ensure the model’s predictions are driven by epidemiologically plausible factors (lines 746-755; Figure 6E-H).

      To address the reviewer’s valid critique of our presentation, we will significantly restructure the relevant parts of the Results section. We will create explicit subheadings such as “Internal Validation,” “External Validation,” and “Comparative Performance against ARIMA Benchmark” to make our comprehensive validation process unambiguous and easy to follow.

      Weakness (4):

      The claim for providing tools for 'early warning' was not validated by analysis and results.

      We agree with this assessment entirely. This aligns with the eLife Assessment and comments from Reviewer #1. Our primary revision will be to systematically recalibrate the manuscript's language. We will replace all instances of “early warning tool” with more accurate and modest phrasing, such as “high-performance forecasting framework” or “a foundational model for future warning systems.” We will ensure that our revised title, abstract, and conclusions precisely reflect what our study has delivered: a robust predictive model, not a field-ready public health intervention tool.

    1. Reviewer #1 (Public review):

      Summary:

      Nahas et al. investigated the roles of herpes simplex virus 1 (HSV-1) structural proteins using correlative cryo-light microscopy and soft X-ray tomography. The authors generated nine viral variants with deletions or mutations in genes encoding structural proteins. They employed a chemical fixation-free approach to study native-like events during viral assembly, enabling observation of a wider field of view compared to cryo-ET. The study effectively combined virology, cell biology, and structural biology to investigate the roles of viral proteins in virus assembly and budding.

      Strengths:

      (1) The study presented a novel approach to studying viral assembly in cellulo.

      (2) The authors generated nine mutant viruses to investigate the roles of essential proteins in nuclear egress and cytoplasmic envelopment.

      (3) The use of correlative imaging with cryoSIM and cryoSXT allowed for the study of viral assembly in a near-native state and in 3D.

      (4) The study identified the roles of VP16, pUL16, pUL21, pUL34, and pUS3 in nuclear egress.

      (5) The authors demonstrated that deletion of VP16, pUL11, gE, pUL51, or gK inhibits cytoplasmic envelopment.

      (6) The manuscript is well-written, clearly describing findings, methods, and experimental design.

      (7) The figures and data presentation are of good quality.

      (8) The study effectively correlated light microscopy and X-ray tomography to follow virus assembly, providing a valuable approach for studying other viruses and cellular events.

      (9) The research is a valuable starting point for investigating viral assembly using more sophisticated methods like cryo-ET with FIB-milling.

      (10) The study proposes a detailed assembly mechanism and tracks the contributions of studied proteins to the assembly process.

      (11) The study includes all necessary controls and tests for the influence of fluorescent proteins.

      Weaknesses:

      Overall, the manuscript does not have any major weaknesses, just a few minor comments, which were mostly solved in the revised version of the manuscript.

      Comments on the latest version:

      I reviewed the responses and the updated manuscript, and I am very pleased with how the authors have revised it. The manuscript was already strong, but with the addition of the summary table and the separated images, it is now excellent.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Nahas et al. investigated the roles of herpes simplex virus 1 (HSV-1) structural proteins using correlative cryo-light microscopy and soft X-ray tomography. The authors generated nine viral variants with deletions or mutations in genes encoding structural proteins. They employed a chemical fixation-free approach to study native-like events during viral assembly, enabling observation of a wider field of view compared to cryo-ET. The study effectively combined virology, cell biology, and structural biology to investigate the roles of viral proteins in virus assembly and budding.

      Strengths:

      (1) The study presented a novel approach to studying viral assembly in cellulo.

      (2) The authors generated nine mutant viruses to investigate the roles of essential proteins in nuclear egress and cytoplasmic envelopment.

      (3) The use of correlative imaging with cryoSIM and cryoSXT allowed for the study of viral assembly in a near-native state and in 3D.

      (4) The study identified the roles of VP16, pUL16, pUL21, pUL34, and pUS3 in nuclear egress.

      (5) The authors demonstrated that deletion of VP16, pUL11, gE, pUL51, or gK inhibits cytoplasmic envelopment.

      (6) The manuscript is well-written, clearly describing findings, methods, and experimental design.

      (7) The figures and data presentation are of good quality.

      (8) The study effectively correlated light microscopy and X-ray tomography to follow virus assembly, providing a valuable approach for studying other viruses and cellular events.

      (9) The research is a valuable starting point for investigating viral assembly using more sophisticated methods like cryo-ET with FIB-milling.

      (10) The study proposes a detailed assembly mechanism and tracks the contributions of studied proteins to the assembly process.

      (11) The study includes all necessary controls and tests for the influence of fluorescent proteins.

      Weaknesses:

      Overall, the manuscript does not have any major weaknesses, just a few minor comments:

      (1) The gel quality in Figure 1 is inconsistent for different samples, with some bands not well resolved (e.g., for pUL11, GAPDH, or pUL20).

      We thank the reviewer for their suggestion. We tried to resolve the bands several times, but unfortunately this was the best outcome we could achieve.

      (2) The manuscript would benefit from a summary figure or table to concisely present the findings for each protein. It is a large body of work, and a summary figure showing the discovered functions would be great.

      We thank the reviewer for their suggestion. We have created a summary table (Table 2).

      (3) Figure 2 lacks clarity on the type of error bars used (range, standard error, or standard deviation). The legend says range; I am just checking that this is what the authors meant.

      We thank the reviewer for double-checking, but it is meant to be range, as reported in the legend. We used range because there are only two data points for each time point, which are insufficient to calculate standard deviation or standard error.

      (4) The manuscript could be improved by including details on how the plasma membrane boundary was estimated from the saturated gM-mCherry signal. An additional supplementary figure with the data showing the saturation used for the boundary definition would be helpful.

      We appreciate the suggestion and have included an example of how saturated gM-mCherry signal was used to delineate the cytoplasm in Supp. Fig. 4A.

      (5) Additional information or supplementary figures on the mask used to filter the YFP signal for Figure 4 would be helpful.

      Thanks, we have adapted the text in the results section to clarify: “eYFP-VP26 signal was manually inspected to determine threshold values that filtered out background and included pixels containing individual or clustered puncta that represent capsids.”

      (6) The figure legends could include information about which samples are used for comparison for significance calculations. As the colour of the brackets is different from the compared values (dUL34), it would be great to have this information in the figure legend.

      Thanks, we have adapted Fig. 4B to make the colour of the brackets match the colour used for the ΔUL34 mutant, and we have included labels next to the brackets for clarity. We have applied similar adjustments to Fig. 5D & E and Supp. Fig. 4C.

      (7) In Figure 5B, the association between YFP and mCherry signals is difficult to assess due to the abundance of mCherry signal; single-channel and combined images might improve visualization.

      Thanks, we have provided split and combined channel views in Supp. Fig. 4B to improve visualization.

      (8) In Figure 6D, staining for tubulin could help identify the cytoskeleton structures involved in the observed virus arrays.

      We thank the reviewer for their suggestion, which we think would be interesting future work to build on the current study. Given the competitive nature of access to the cryoSIM and cryoSXT instruments used for CLXT, staining for tubulin was outside the scope of the additional experiments we were able to conduct at this time.

      (9) It is unclear in Figure 6D if the microtubule-associated capsids are with the gM envelope or not, as the signal from mCherry is quite weak. It could be made clearer with the split signals to assess the presence of both viral components.

      We have provided split channels to the figure to aid with visualization.

      (10) The representation of voxel intensity in Figure 8 is somewhat confusing. Reversing the voxel intensity representation to align brighter values with higher absorption would simplify interpretation.

      We thank the reviewer for this suggestion. In contrast to fluorescence microscopy where high intensities reflect signal, low intensities represent signal (absorbance of X-rays) in cryoSXT. We respectfully decided not to reverse the values, as we believe that could cause more confusion. We have instead added a black-to-white gradient bar to illustrate that low voxel intensities correspond to dark signal in Fig 8.

      (11) The visualization in panel I of Figure 8 might benefit from a more divergent colormap to better show the variation in X-ray absorbance.

      We thank the reviewer for their suggestion. We experimented with a few different colour schemes but concluded that the current one produced the clearest results and was most accessible for color-blind viewers.

      (12) Figure 9 would be enhanced by images showing the different virus sizes measured for the comparative study, which would help assess the size differences between different assembly stages.

      We thank the reviewer for their suggestion and have included images to accompany the graph.

      Overall, this is an excellent manuscript and an enjoyable read. It would be interesting to see this approach applied to the study of other viruses, providing valuable insights before progressing to high-resolution methods.

      Reviewer #2 (Public review):

      Summary:

      For centuries, humans have been developing methods to see ever smaller objects, such as cells and their contents. This has included studies of viruses and their interactions with host cells during processes extending from virion structure to the complex interactions between viruses and their host cells: virion entry, virus replication and virion assembly, and release of newly constructed virions. Recent developments have enabled simultaneous application of fluorescence-based detection and intracellular localization of molecules of interest in the context of sub-micron resolution imaging of cellular structures by electron microscopy.

      The submission by Nahas et al. extends the state-of-the-art for visualization of important aspects of herpesvirus (HSV-1 in this instance) virion morphogenesis, a complex process that involves virus genome replication, and capsid assembly and filling in the nucleus, transport of the nascent nucleocapsid and some associated tegument proteins through the inner and outer nuclear membranes to the cytoplasm, orderly association of several thousand mostly viral proteins with the capsid to form the virion's tegument, envelopment of the tegumented capsid at a virus-tweaked secretory vesicle or at the plasma membrane, and release of mature virions at the plasma membrane.

      In this groundbreaking study, cells infected with HSV-1 mutants that express fluorescently tagged versions of capsid (eYFP-VP26) and envelope (gM-mCherry) proteins were visualized with 3D correlative structured illumination microscopy and X-ray tomography. The maturation and egress pathways thus illuminated were studied further in infections with fluorescently tagged viruses lacking one of nine viral proteins.

      Strengths:

      This outstanding paper meets the journal's definitions of Landmark, Fundamental, Important, Valuable, and Useful. The work is also Exceptional, Compelling, Convincing, and Solid. The work is a tour de force of classical and state-of-the-art molecular and cellular virology. Beautiful images accompanied by appropriate statistical analyses and excellent figures. The numerous complex issues addressed are explained in a clear and coordinated manner; the sum of what was learned is greater than the sum of the parts. Impacts go well beyond cytomegalovirus and the rest of the herpesviruses, to other viruses and cell biology in general.

      Reviewer #3 (Public review):

      Summary:

      Kamal L. Nahas et al. demonstrated that pUL16, pUL21, pUL34, VP16, and pUS3 are involved in the egress of the capsids from the nucleus, since mutant viruses ΔpUL16, ΔpUL21, ΔUL34, ΔVP16, and ΔUS3 HSV-1 show nuclear egress attenuation determined by measuring the nuclear:cytoplasmic ratio of the capsids, the dfParental, or the mutants. Then, they showed that gM-mCherry+ endomembrane association and capsid clustering were different in pUL11, pUL51, gE, gK, and VP16 mutants. Furthermore, the 3D view of cytoplasmic budding events suggests an envelopment mechanism where capsid budding into spherical/ellipsoidal vesicles drives the envelopment.

      Strengths:

      The authors employed both structured illumination microscopy and cellular ultrastructure analysis to examine the same infected cells, using cryo-soft-X-ray tomography to capture images. This combination, set here for the first time, enabled the authors to obtain holistic data regarding a biological process, as a viral assembly. Using this approach, the researchers studied various stages of HSV-1 assembly. For this, they constructed a dual-fluorescently labelled recombinant virus, consisting of eYFP-tagged capsids and mCherry-tagged envelopes, allowing for the independent identification of both unenveloped and enveloped particles. They then constructed nine mutants, each targeting a single viral protein known to be involved in nuclear egress and envelopment in the cytoplasm, using this dual-fluorescent as the parental one. The experimental setting, both the microscopic and the virological, is robust and well-controlled. The manuscript is well-written, and the data generated is robust and consistent with previous observations made in the field.

      Weaknesses:

      It would be helpful to find out what role the targeted proteins play in nuclear egress or envelopment acquisition in a different orthoherpesvirus, like HSV-2. This would confirm the suitability of the technical approach set and would also act as a way to validate their mechanism at least in one additional herpesvirus beyond HSV-1. So, using the current manuscript as a starting point and for future studies, it would be advisable to focus on the protein functions of other viruses and compare them.

      We appreciate the suggestion and agree that this would be a great starting point for future studies. At present, we do not have a panel of mutant viruses in HSV-2 or another orthoherpesvirus, and it would be significant work to generate them, so we consider this outside the scope of the current study.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) There are enough uncommon abbreviations in the text to justify the inclusion of an abbreviation list.

      We thank the reviewer for the suggestion, but we define all uncommon abbreviations at first mention and an abbreviations list is not part of eLife’s house style.

      (2) The complex paragraph on p. 7 would be much easier to digest if broken into smaller chunks. Consider similar treatment for other lengthy landmark-free blocks of text, e.g., the one that begins on p. 14. Subheadings would help.

      We thank the reviewer for this suggestion. We have divided large paragraphs into more easily digestible chunks throughout the manuscript, for example in the discussion where the previous monolithic 3rd paragraph has been divided into five shorter, focussed paragraphs.

      (3) Table 1 needs units.

      We thank the reviewer for noticing our omission and apologise for the oversight - the table has been updated accordingly.

      Reviewer #3 (Recommendations for the authors):

      (1) Toward the end of the manuscript, I missed some lines attempting to speculate on the origin/nature of the spherical/ellipsoidal vesicles providing the envelopment. Would it be possible to incorporate this in the Discussion section?

      Thank you for noticing that omission. We have now included a few lines speculating that they may represent recycling endosomes, trans-Golgi network vesicles, or a hybrid compartment.

      (2) I congratulate the authors. The work is robust, and I personally highlight the way they managed to include others' results merged with their own, providing a complete view of the story.

      We thank the reviewer for their kind words.

      Note to editors

      In addition to these responses to the reviewer’s comments, we have also now included in the methods section details of the Tracking of Indels by Decomposition (TIDE) analysis we performed (data in Supplementary Figure 3) that was omitted by mistake from the original submission.

    1. shall pay double

      Failure to take vacation within the correct period gives rise to the duty to pay double the remuneration.

      Note: Precedent (Súmula) No. 450 of the TST was in force. It provided for double payment when the deadline for paying vacation-related amounts was not observed, in addition to the statutory case of double payment for vacation taken outside the proper period. However, it was held unconstitutional in ADPF 501.

      Syllabus: CLAIM OF NON-COMPLIANCE WITH A FUNDAMENTAL PRECEPT (ADPF). CONSTITUTIONAL AND LABOR LAW. PRECEDENT 450 OF THE SUPERIOR LABOR COURT. DOUBLE PAYMENT OF VACATION REMUNERATION WHEN THE DEADLINE OF ART. 145 OF THE CLT IS EXCEEDED. IMPOSSIBILITY OF THE JUDICIARY ACTING AS A POSITIVE LEGISLATOR. ABSENCE OF A GAP. RESTRICTIVE INTERPRETATION OF A SANCTIONING RULE. VIOLATION OF THE SEPARATION OF POWERS AND OF THE PRINCIPLE OF LEGALITY. CLAIM UPHELD. - 1. The branches of the State must act in harmony, favoring cooperation and institutional loyalty and avoiding practices of institutional guerrilla warfare, which end up undermining governmental cohesion and public trust in the conduct of public affairs by public officials. Precedents. - 2. The Judiciary cannot act as a positive legislator so as to extend the scope of the sanction provided for in art. 137 of the CLT to a different situation that is already sanctioned by another rule. - 3. There is no gap that would justify a construction by analogy through case law. Sanctioning rules must be interpreted restrictively. Precedents and other statements of case law issued by the Superior Labor Court and the Regional Labor Courts may not create obligations not provided for by law (CLT, art. 8, § 2). - 4. Claim upheld.

      (ADPF 501, Rapporteur: ALEXANDRE DE MORAES, Full Court, decided on 08-08-2022, ELECTRONIC CASE DJe-163, RELEASED 17-08-2022, PUBLISHED 18-08-2022)

    1. for epoch in range(4):
           for iter in range(3):
               x = torch.rand(2, 3, 224, 224)

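      Here x is generated inside the training-batch loop, so a new random tensor is created on each of the three iterations: a batch of 2 random 3-channel 224×224 images.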

    1. This rare poster also encourages women to take agricultural training courses at a university, something that was not often encouraged.

      first poster that is positive towards women

    2. In his arms he has Lady Liberty, whose breasts are exposed, perhaps to show that if the Germans made it to America, American women would be assaulted.

      interesting way of portraying this

    1. Reviewer #3 (Public review):

      To summarize: The authors' overfilling hypothesis depends crucially on the premise that the very-quickly reverting paired-pulse depression seen after unusually short rest intervals of << 50 ms is caused by depletion of release sites whereas Dobrunz and Stevens (1997) concluded that the cause was some other mechanism that does not involve depletion. The authors now include experiments where switching extracellular Ca2+ from 1.2 to 2.5 mM increases synaptic strength on average, but not by as much as at other synapse types. They contend that the result supports the depletion hypothesis. I didn't agree because the model used to generate the hypothesis had no room for any increase at all, and because a more granular analysis revealed a mixed population with a subset where: (a) synaptic strength increased by as much as at standard synapses; and yet (b) the quickly reverting depression for the subset was the same as the overall population.

      The authors raise the possibility of additional experiments, and I do think this could clarify things if they pre-treat with EGTA as I recommended initially. They've already shown they can do this routinely, and it would allow them to elegantly distinguish between pv and pocc explanations for both the increases in synaptic strength and the decreases in the paired pulse ratio upon switching Ca2+ to 2.5 mM. Plus/minus EGTA pre-treatment trials could be interleaved and done blind with minimal additional effort.

      Showing reversibility would be a great addition too, because, in our experience, this does not always happen in whole-cell recordings in ex-vivo tissue even when electrical properties do not change. If the goal is to show that L2/3 synapses are less sensitive to changes in Ca2+ compared to other synapse types - which is interesting but a bit off point - then I would additionally include a positive control, done by the same person with the same equipment, at one of those other synapse types using the same kind of presynaptic stimulation (i.e. ChRs).

      Specific points (quotations are from the Authors' rebuttal)

      (1) Regarding the Author response image 1, I was instead suggesting a plot of PPR in 1.2 mM Ca2+ versus the relative increase in synaptic strength in 2.5 versus in 1.2 mM. This continues to seem relevant.

      (2) "Could you explain in detail why two-fold increase implies pv < 0.2?"

      a. start with power((2.5/(1 + (2.5/K1) + 1/2.97)),4) = 2*power((1.3/(1 + (1.3/K1) + 1/2.97)),4);

      b. solve for K1 (this turns out to be 0.48);

      c. then implement the premise that pv -> 1.0 when Ca2+ is high by calculating Max = power((C/(1 + (C/K1) + 1/2.97)),4) where C is [Ca] -> infinity.

      d. pv when [Ca] = 1.3 mM must then be power((1.3/(1 + (1.3/K1) + 1/2.97)),4)/Max, which is <0.2.

      Note that modern updates of Dodge and Rahamimoff typically include a parameter that prevents pv from approaching 1.0; this is the gamma parameter in the versions from Neher group.
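
      The arithmetic in steps a to d can be reproduced with a short script. The following is a minimal sketch, not the reviewer's own code: the function names are hypothetical, the constant 1/2.97 is copied from the expression above without interpretation, and K1 is obtained by taking the fourth root of both sides of the equation in step a and solving the resulting linear equation in 1/K1.

```python
def saturating_term(ca, k1, const=1 / 2.97):
    """Inner term of the expression above: C / (1 + C/K1 + 1/2.97)."""
    return ca / (1 + ca / k1 + const)


def solve_k1(ca_low=1.3, ca_high=2.5, fold=2.0, const=1 / 2.97):
    """Solve term(ca_high)**4 == fold * term(ca_low)**4 for K1 (steps a and b)."""
    r = fold ** 0.25          # fourth root of the fold change
    a = 1 + const             # constant part of the denominator
    # With u = 1/K1:  ca_high * (a + ca_low*u) = r * ca_low * (a + ca_high*u)
    u = a * (ca_high - r * ca_low) / (ca_low * ca_high * (r - 1))
    return 1 / u


def pv_at(ca, k1, const=1 / 2.97):
    """Release probability relative to the saturating maximum (steps c and d)."""
    # As C -> infinity the inner term tends to K1, so Max = K1**4.
    max_value = k1 ** 4
    return saturating_term(ca, k1, const) ** 4 / max_value


k1 = solve_k1()
print(f"K1 = {k1:.2f}")
print(f"pv at 1.3 mM = {pv_at(1.3, k1):.4f}")
print(f"pv at 2.5 mM = {pv_at(2.5, k1):.4f}")
```

      By construction, the value at 2.5 mM is twice the value at 1.3 mM, which is a quick check that the solved K1 satisfies the starting equation.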

      (3) "If so, we can not understand why depletion-dependent PPD should lead to PPF."

      When PPD is caused by depletion and pv < 0.2, the number of occupied release sites should not be decreased by more than one-fifth at the second stimulus so, without facilitation, PPR should be > 0.8. The EGTA results then indicate there should be strong facilitation, driving PPR to something like 1.2 with conservative assumptions. And yet, a value of < 0.4 is measured, which is a large miss.
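
      The bound in this argument is simple arithmetic and can be written out explicitly. The sketch below assumes, for illustration only, that every release site is occupied before the first stimulus and that no site refills before the second. The function name is hypothetical, and the facilitation factor is merely the ratio implied by the figures quoted above (1.2 against 0.8), not a measured value.

```python
def ppr_depletion_only(pv):
    """Paired-pulse ratio if the only change is depletion of a fraction pv of sites.

    Assumes full occupancy before the first stimulus and no refilling in between.
    """
    return 1.0 - pv


pv_upper = 0.2                                # upper bound on pv from point (2)
ppr_floor = ppr_depletion_only(pv_upper)      # PPR must exceed this without facilitation
implied_facilitation = 1.2 / ppr_floor        # factor needed to reach the quoted ~1.2
measured_ppr = 0.4                            # the measured value is below this

print(f"PPR floor from depletion alone: {ppr_floor:.2f}")
print(f"Implied facilitation factor:    {implied_facilitation:.2f}")
print(f"Gap between floor and measured: {ppr_floor - measured_ppr:.2f}")
```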

      (4) Despite the authors' suggestion to the contrary, I continue to think there is a substantial chance that Ca2+-channel inactivation is the mechanism underlying the very quickly reverting paired-pulse depression. However, this is only one example of a non-depletion mechanism among many, with the main point being that any non-depletion mechanism would undercut the reasoning for overfilling. And, this is what Dobrunz and Stevens claimed to show; that the mechanism - whatever it is - does not involve depletion. The most effective way to address this would be affirmative experiments showing that the quickly reverting depression is caused by depletion after all. Attempting to prove that Ca2+-channel inactivation does not occur does not seem like a worthwhile strategy because it would not address the many other possibilities.

      (5) True that Kusick et al. observed morphological re-docking, but then vesicles would have to re-prime and Mahfooz et al. (2016) showed that re-priming would have to be slower than 110 ms (at least during heavy use at calyx of Held).

    1. clarity around memberships and partnerships

      should look like. We believe in diversity.

      www.sideways.earth

      Main benefits

      • Exchange and commerce
      • Cross-organization dispute resolution
      • International recognition
      • Member hosting and benefits across organizations
      • Values alignment / adherence to standards
    1. basal area is $c \cdot \text{DBH}^2$, where $c$ is often referred to as the “forester's constant” and, depending on your measurement system, is either 0.005454 or 0.00007854

      I would not recommend introducing any "forester's constant"; calculating the area of a circle from its diameter is just simple geometry. Students tend to learn such things by heart and forget the fundamentals. Here, 0.00007854 is simply π/4 divided by 10,000: the 1/4 comes from squaring the radius, (DBH/2)^2, and the 10,000 comes from converting cm² to m², because DBH is measured in centimeters. I prefer to explain to my students how to calculate the area of a circle and remind them that we want the result in square meters, instead of confusing them.
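
      The geometry behind both constants can be made explicit in a few lines. This is a minimal sketch with hypothetical function names; it assumes that the metric constant takes DBH in centimeters and returns square meters, and that the imperial constant takes DBH in inches and returns square feet.

```python
import math


def basal_area_m2(dbh_cm):
    """Basal area in square meters from DBH in centimeters."""
    radius_m = (dbh_cm / 2) / 100      # halve the diameter, then cm -> m
    return math.pi * radius_m ** 2     # area of a circle


def basal_area_ft2(dbh_in):
    """Basal area in square feet from DBH in inches."""
    radius_ft = (dbh_in / 2) / 12      # halve the diameter, then in -> ft
    return math.pi * radius_ft ** 2


# A DBH of 1 unit recovers each constant.
print(f"metric constant:   {basal_area_m2(1):.8f}")
print(f"imperial constant: {basal_area_ft2(1):.6f}")
```

      Writing the unit conversion as its own step shows where each factor comes from, so nothing has to be memorized.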

    1. Reviewer #4 (Public review):

      The manuscript reports on a large-scale study correlating genomic architecture with splicing complexity over almost 1,500 species. We still know relatively little about alternative splicing functional consequences and evolution, and thus, the study is relevant and timely. The methodology relies on annotations from NCBI for high-quality genomes and a main metric proposed by the authors and named Alternative Splicing Ratio (ASR). It quantifies the level of redundancy of each coding nucleotide in the annotated isoforms.

      According to the authors' response to the first reviewers' comments, the present version of the manuscript seems to be a profoundly revised version compared to the original submission. I did not have access to the reviewers' comments.

      Although the study addresses an important question and the authors have visibly made an important effort to make their claims more statistically robust, I have a number of major concerns regarding the methodology and its presentation.

      (1) A large part of the manuscript is speculative and vague. For instance, the Discussion is very long (almost longer than the Results section) and the items discussed are sometimes not directly connected to the present work. I would suggest merging the last two paragraphs, for instance, since the penultimate paragraph is essentially a review of the literature without direct connection to the present work.

      (2) The Methods section lacks clarity and precision. A large part is devoted to explaining the biases in the data without any reference or quantification. The definition of ASR is very confusing. It is first defined in equation 2, with a different name, and then again in the next subsection from a different perspective on lines 512-518. Why build matrices of co-occurrences if these are, in practice, never used? It seems the authors exploit only the trace. A major revision, if I understood correctly, was the correction/normalisation of the ASR metric. This normalisation is not explained. The authors argue that they will write another paper about it; I do not think this is acceptable for the publication of the present manuscript. Furthermore, there is no information about the technical details of the implementation: which packages did the authors use?

      (3) Could the authors motivate why they do not directly focus on the MC permutation test? They motivate the use of permutations because the data contain extreme outliers and are non-normal in most cases. Hence, it seems that Welch's ANOVA is not suitable. "To further validate our findings, we also conducted a Monte Carlo permutation test, which supported the conclusions (see Methods)." Where is the comparison shown? I did not see any report of the results for the non-permuted version of Welch's ANOVA.

      (4) What are the assumptions for the Phylogenetic Generalized Least Squares? Which evolution model was chosen and why? What is the impact of changing the model? Could the authors define more precisely (e.g. with equations) what is lambda? Is it estimated or fixed?

      (5) I think the authors could improve their account of recent literature on the topic. For instance, the paper https://doi.org/10.7554/eLife.93629.3, published in the same journal last year, should be discussed. It fits perfectly within the scope of the subsection "Evidence for the adaptive role of alternative splicing". Methods and findings reported in https://doi.org/10.1186/s13059-021-02441-9 and https://www.genome.org/cgi/doi/10.1101/gr.274696.120 directly concern the assessment of AS evolutionary conservation across long evolutionary times and/or across many species. These aspects are mentioned in the introduction on p. 3, but without pointing to such works. Can we really qualify a work published in 2011 as "recent" (lines 348-350)?

      The generated data and codes are available on Zenodo, which is a good point for reproducibility and knowledge sharing with the community.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents an extensive body of work and an outstanding contribution to our understanding of the IFN type I and III system in chickens. The research started with the innovative approach of generating KO chickens that lack the receptor for IFNα/β (IFNAR1) or IFN-λ (IFNLR1). The successful deletion and functional loss of these receptors was clearly and comprehensively demonstrated in comparison to the WT. Moreover, the homozygous KO lines (IFNAR1-/- or IFNLR1-/- ) were found to have similar body weights, and normal egg production and fertility compared to their WT counterparts. These lines are a major contribution to the toolbox for the study of avian/chicken immunology.

      The significance of this contribution is further demonstrated by the use of these lines by the authors to gain insight into the roles of IFN type I and IFN-type III in chickens, by conducting in ovo and in vivo studies examining basic aspects of immune system development and function, as well as the responses to viral challenges conducted in ovo and in vivo.

      Solid, state-of-the-art methods and convincing evidence from studies comparing various immune-system-related functions in the IFNAR1-/- or IFNLR1-/- lines with the WT revealed that the deletion of IFNAR1 and/or IFNLR1 resulted in:

      (1) impaired IFN signaling and induction of anti-viral state;

      (2) modulation of immune cell profiles in the peripheral blood circulation and spleen;

      (3) modulation of the cecum microbiome;

      (4) reduced concentrations of IgM and IgY in the blood plasma before and following immunization with model antigen KLH, whereby also line differences in the time-course of the antibody production were observed;

      (5) decrease in MHCII+ macrophages and B cells in the spleen of IFNAR1 KO chickens, although the MHCII-expression per cell was not affected in this line; and

      (6) reduction in the response of αβ1 TCR+ T cells of IFNAR1 KO chickens as suggested by clonal repertoire analyses.

      These studies were then followed by examination of the role of type I and type III IFN in virus infection, using different avian influenza A virus strains as well as an avian gammacoronavirus (IBV) in in ovo challenge experiments. These studies revealed: viral titers that reflect virus-species and strain-specific IFN responses; no differences in the secretion of IFN-α/β in either KO line compared with the WT line; a predominant role of type I IFN in inducing the interferon-stimulated gene (ISG) Mx; and that an excessive and unbalanced type I IFN response can harm host fitness (survival rate, length of survival) and contribute to immunopathology.

      Based on guidance from the in ovo studies, comprehensive in vivo studies were conducted on host-pathogen interactions in hens from the three lines (WT, IFNAR1 KO, or IFNLR1 KO). These studies revealed the early appearance of symptoms and poor survival of hens from the IFNAR1 KO line challenged with H3N1 avian influenza A virus; efficient H3N1 virus replication in IFNAR1 KO hens; increased plasma concentrations of IFNα/β and mRNA expression of IFN-λ in spleens of the IFNAR1 KO hens; a pro-inflammatory role of IFN-λ in the oviduct of hens infected with H3N1 virus; increased proinflammatory cytokine expression in spleens of IFNAR1 KO hens; and impairment of negative feedback mechanisms regulating IFN-α/β secretion in IFNAR1-KO hens and a significant decrease in this group's antiviral state. Additionally, it was demonstrated that IFN-α/β can compensate for IFN-λ to induce an adequate antiviral state in the spleen during H3N1 infection, but IFN-λ cannot compensate for IFN-α/β signaling in the spleen.

      Strengths:

      (1) Both the methods and results from the comprehensive, well-designed, and well-executed experiments are considered excellent. The results are well and correctly described in the result narrative and well presented in both the manuscript and supplement Tables and Figures. Excellent discussion/interpretation of results.

      (2) The successful generation of the type I and type III IFN KO lines offers unprecedented insight and opens multiple new avenues for exploring the IFN system in chickens. The new knowledge reported here is direct evidence of the high impact of this model system on effectively addressing a critical knowledge gap in avian immunology.

      (3) The thoughtful selection of highly relevant viruses to poultry and human health for the in ovo and in vivo challenge studies to examine and assess host-pathogen interactions in the IFNR KO and WT lines.

      (4) Making use of the unique opportunities in the chicken model to examine and evaluate the host's IFN system responses to various viral challenges in ovo, before conducting challenge studies in hens.

      (5) The new knowledge gained from the IFNAR1 and IFNLR1 KO lines will find much-needed application in developing more effective strategies to prevent health challenges like avian influenza and its devastating effects on poultry, humans, and other mammals.

      (6) The excellent cooperation and contributions of the co-authors and institutions.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    2. Reviewer #2 (Public review):

      Summary:

      This study attempts to dissect the contributions of type I and type III IFNs to the antiviral response in chickens. The first part of the study characterises the generation of IFNAR and IFNLR KO chicken strains and describes basic differences. Four different viruses are then tested in chicken embryos, while the subsequent analysis of the antiviral response in vivo is performed with one influenza H3N1 strain.

      Strengths:

      Having these two KO chicken strains as a tool is a great achievement. The initial analysis is solid. Clear effect of IFNAR deficiency in in vivo infection, less so for IFNLR deficiency.

      Weaknesses:

      (1) The antibody induction by KLH immunisation: No data indicated whether or not this vaccination induces IFN responses in wt mice, so the effects observed may be due to steady-state differences or to differential effects of IFN induced during the vaccination phase. No pre-immune results are shown. The differences are relatively small and often found at only one plasma dilution - the whole of Figure 4 could be condensed into one or two panels by proper calculation of Ab titers - would these titres be significantly different? This, as all of the other in vivo experiments, has not been repeated, if I understand the methods section correctly.

      (2) The basic conundrum here and in later figures is never addressed by the authors: Situations where IFN type 1 and 3 signalling deficiency each have an independent effect (i.e., Figure 4d) suggest that they act by separate, unrelated mechanisms. However, all the literature about these IFN families suggests that they show almost identical signalling and gene induction downstream of their respective receptors. How can the same signalling, clearly active here downstream of the receptors for IFN type 1 or type 3, be non-redundant, i.e., why does the unaffected IFN family not stand in? This is a major difference from the mouse studies, which showed a rather subtle phenotype when only one of the two IFN systems was missing, but a massive reduction in virus control in double KO mice (the correct primary paper should be quoted here, not only the review by McNab). Reasons could be a direct effect of IFNab on B cells and an indirect effect of IFNL through non-B cells, timing issues, and many other scenarios can be envisaged. The authors do not address this question, which limits the depth of analysis.

      (3) In the one in vivo experiment performed with chickens, only one virus was tested; more influenza strains should be included, as well as non-influenza viruses.

      (4) The basic conundrum of point 2 applies equally to Figure 6a; both KOs have a phenotype. Again in 6d, both IFNs appear to be separately required for Mx induction. An explanation is needed.

      (5) Line 308, where are the viral titers you refer to in the text? The statement that the results demonstrate that excessive IFNab has a negative impact is overstretched, as no IFN measurements of the infected embryos are shown here.

      (6) The in vivo infection is the most interesting experiment, and the key outcome here is that IFN type 1 is crucial for anti-H3N1 protection in chickens, while type 3 is less impactful. However, this experiment suffers from the different time points when chickens were culled, so many parameters are impossible to compare (e.g., weight loss, histopathology, IFN measurements, and more). Many of these phenomena are highly dynamic in acute virus infections, so disparate time points do not allow a meaningful comparison between different genotypes. What are the stats in 7b? Is the median rather than the mean indicated by the line? Otherwise, the lines appear in surprising places. SD must be shown, and I find it difficult to believe that there is a significant difference in weight, e.g., for IFNAR KO, unless maybe with a paired t test. What is the statistical test?

      (7) Figures 7e,f: these comparisons are very difficult to interpret as the virus loads at these time points already differ significantly, so any difference could be secondary to virus load differences.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors reveal that the availability of extracellular asparagine (Asn) represents a metabolic vulnerability for the activation and differentiation of naive CD4+ T cells. To deplete extracellular Asn, they employed two orthogonal approaches: activating naive CD4+ T cells in either PEGylated asparaginase (PEG-AsnASE)-treated medium or custom-formulated RPMI medium specifically lacking Asn. Importantly, they demonstrate that Asn depletion not only impaired metabolic reprogramming associated with CD4+ T cell activation but also reduced CD4+ helper T cell lineage-specific cytokine production, thereby ameliorating the severity of experimental autoimmune encephalomyelitis.

      Strengths:

      The experiments presented here are comprehensive and well-designed, providing compelling evidence for the conclusions. The conclusions will be important to the field.

      Weaknesses:

      (1) EAE is the prototypic T cell-mediated autoimmune disease model, and both Th1 and Th17 cells are implicated in its pathogenesis. In contrast, Th2 and Treg cells and their associated cytokines (such as IL-4 and IL-10) have been shown to play a role in the resolution of EAE, and potentially in the modulation of disease progression. Thus, it will be important to determine whether Asn depletion affects the differentiation of naive CD4+ T cells into corresponding subsets under Th2 and Treg polarization conditions, as well as the expression of lineage-specific transcription factors and cytokine production.

      (2) EAE is characterized by inflammation and demyelination in the central nervous system (CNS), leading to neurological deficits. Myelin destruction is directly correlated with the severity of the disease. For Figure 6, did the authors perform spinal cord histological analysis by hematoxylin and eosin (H&E) or Luxol fast blue (LFB) staining? This is important to rigorously examine pathological EAE symptoms.

    2. Reviewer #2 (Public review):

      While the importance of asparagine in the differentiation and activation of CD8 T cells has been previously reported, its role in CD4 T cells remained unclear. Using culture media containing specific amino acids, the authors demonstrated that extracellular asparagine promotes CD4 T cell proliferation. Consistent with this, depletion of extracellular asparagine using PEG-AsnASE suppressed CD4 T cell activation. Proteomic analysis focusing on asparagine content revealed that, during the early phase of T cell activation, most asparagine incorporated into proteins is derived from extracellular sources. The authors further confirmed the importance of extracellular asparagine in vivo, demonstrating improved EAE pathology.

      While the data are well organized and convincing, the mechanism by which asparagine deficiency leads to altered T cell differentiation remains unclear. It is also necessary to investigate the transporters involved in asparagine uptake. In particular, elucidating whether different T cell subsets utilize the same or distinct transport mechanisms would provide important insight into the immunoregulatory role of asparagine.

      (1) The finding that asparagine supplementation promotes T cell proliferation under various amino acid conditions is highly significant. However, the concentration at which this effect occurs remains unclear. A titration analysis would be necessary to determine the dose-dependency of asparagine.

      (2) The effects of asparagine deficiency occur during the early phase of T cell activation. Thus, it is likely that the transporters responsible for asparagine uptake are either rapidly induced upon activation or already expressed in the resting state. Since this is central to the focus of the manuscript, it would be interesting to identify the transporter responsible for asparagine uptake during early T cell activation. A recent paper (DOI: 10.1126/sciadv.ads350) reported that macrophages utilize Slc6a14 to use extracellular asparagine. Is this also true for CD4+ T cells?

      (3) Given that depletion of extracellular asparagine impairs differentiation of Th1 and Th17 cells, it is possible that TCR signaling is compromised under these conditions. This point should be investigated by targeting downstream signaling molecules such as Lck, ZAP70, or mTOR. Also, does it affect the protein stability of master transcription factors such as T-bet and RORγt?

      (4) Is extracellular asparagine also important for the differentiation of helper T cell subsets other than Th1 and Th17, such as Th2, Th9, and iTreg?

      (5) Asparagine taken up from outside the cell has been shown to be used for de novo protein synthesis (Figure 3E), but are there any proteins that are particularly susceptible to asparagine deficiency? This can be verified by performing proteome analysis, and the effects on Th1/17 subset differentiation mentioned above should also be examined.

      (6) While the importance of extracellular asparagine is emphasized, Asns expression is markedly induced during early T cell activation. Nevertheless, the majority of asparagine incorporated into proteins appears to be derived from extracellular sources. Does genetic deletion of Asns have any impact on early CD4+ T cell activation? The authors indicated that newly synthesized Asns have little impact on CD8+ T cells in the Discussion section, but is this also true for CD4+ T cells? This could be verified through experiments using CRISPR-mediated Asns gene targeting or pharmacological inhibition.

    1. Reviewer #3 (Public review):

      Summary:

      This study aimed to investigate pseudouridylation across various RNA species in multiple bacterial strains using an optimized BID-seq approach. It examined both conserved and divergent modification patterns, the potential functional roles of pseudouridylation, and its dynamic regulation across different growth conditions.

      Strengths:

      The authors optimized the BID-seq method and applied this important technique to bacterial systems, identifying multiple pseudouridylation sites across different species. They investigated the distribution of these modifications, associated sequence motifs, their dynamics across growth phases, and potential functional roles. These data are of great interest to researchers focused on understanding the significance of RNA modifications, particularly mRNA modifications, in bacteria.

      Weaknesses:

      (1) The reliability of BID-seq data is questionable due to a lack of experimental validations.

      (2) The manuscript is not well-written, and the presented work shows a major lack of scientific rigor, as several key pieces of information are missing.

      (3) The manuscript's organization requires significant improvement, and numerous instances of missing or inconsistent information make it difficult to understand the key objectives and conclusions of the study.

      (4) The rationale for selecting specific bacterial species is not clearly explained, and the manuscript lacks a systematic comparison of pseudouridylation among these species.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary

      This paper summarises responses from a survey completed by around 5,000 academics on their manuscript submission behaviours. The authors find several interesting stylised facts, including (but not limited to):

      Women are less likely to submit their papers to highly influential journals (e.g., Nature, Science and PNAS).

      Women are more likely to cite the demands of co-authors as a reason why they didn't submit to highly influential journals.

      Women are also more likely to say that they were advised not to submit to highly influential journals.

      The paper highlights an important point, namely that the submission behaviours of men and women scientists may not be the same (either due to preferences that vary by gender, selection effects that arise earlier in scientists' careers or social factors that affect men and women differently and also influence submission patterns). As a result, simply observing gender differences in acceptance rates - or a lack thereof - should not be automatically interpreted as evidence for or against discrimination (broadly defined) in the peer review process.

      Major comments

      What do you mean by bias?

      In the second paragraph of the introduction, it is claimed that "if no biases were present in the case of peer review, then we should expect the rate with which members of less powerful social groups enjoy successful peer review outcomes to be proportionate to their representation in submission rates." There are a couple of issues with this statement.

      First, the authors are implicitly making a normative assumption that manuscript submission and acceptance rates *should* be equalised across groups. This may very well be the case, but there can also be valid reasons - even when women are not intrinsically better at research than men - why a greater fraction of female-authored submissions are accepted relative to male-authored submissions (or vice versa). For example, if men are more likely to submit their less ground-breaking work, then one might reasonably expect that they experience higher rejection rates compared to women, conditional on submission.

      We do assume that normative statement: unless we believe that men’s papers are intrinsically better than women’s papers, the acceptance rate should be the same. But the referee is right: we have no way of controlling for the intrinsic quality of the work of men and women. That said, our manuscript does not show that there is a different acceptance rate for men and women; it shows that women are less likely to submit papers to a subset of journals that are of a lower Journal Impact Factor, controlling for their most cited paper, in an attempt to control for intrinsic quality of the manuscripts.

      Second, I assume by "bias", the authors are taking a broad definition, i.e., they are not only including factors that specifically relate to gender but also factors that are themselves independent of gender but nevertheless disproportionately are associated with one gender or another (e.g., perhaps women are more likely to write on certain topics and those topics are rated more poorly by (more prevalent) male referees; alternatively, referees may be more likely to accept articles by authors they've met before, most referees are men and men are more likely to have met a given author if he's male instead of female). If that is the case, I would define more clearly what you mean by bias. (And if that isn't the case, then I would encourage the authors to consider a broader definition of "bias"!)

      Yes, the referee is right that we are taking a broad definition of bias. We provide a definition of bias on page 3, line 92. This definition is focused on differential evaluation which leads to differential outcomes. We also hedge our conversation (e.g., page 3, line 104) to acknowledge that observations of disparities may only be an indicator of potential bias, as many other things could explain the disparity. In short, disparities are a necessary but insufficient indicator of bias. We add a line in the introduction to reinforce this. The only other reference to the term bias comes on page 10, line 276. We add a reference to Lee here to contextualize.

      Identifying policy interventions is not a major contribution of this paper

      I would take out the final sentence in the abstract. In my opinion, your survey evidence isn't really strong enough to support definitive policy interventions to address the issue and, indeed, providing policy advice is not a major - or even minor - contribution of your paper. (Basically, I would hope that someone interested in policy interventions would consult another paper that much more thoughtfully and comprehensively discusses the costs and benefits of various interventions!) While it's fine to briefly discuss them at the end of your paper - as you currently do - I wouldn't highlight that in the abstract as being an important contribution of your paper.

      We thank the referee for this comment. While we agree that our results do not lead to definitive policy interventions, we believe that our findings point to a phenomenon that should be addressed through policy interventions. Given that some interventions are proposed in our conclusion, we feel like stating this in the abstract is coherent.

      Minor comments

      What is the rationale for conditioning on academic rank and does this have explanatory power on its own - i.e., does it at least superficially potentially explain part of the gender gap in intention to submit?

      Thank you for this thoughtful question. We conditioned on academic rank in all regression analyses to account for structural differences in career stage that may potentially influence submission behaviors. Academic rank (e.g., assistant, associate, full professor) is a key determinant of publishing capacity and strategic considerations, such as perceived likelihood of success at elite journals, tolerance for risk, and institutional expectations for publication venues.

      Importantly, academic rank is also correlated with gender due to cumulative career disadvantages that contribute to underrepresentation of women at more senior levels. Failing to adjust for rank would conflate gender effects with differences attributable to career stage. By including rank as a covariate, we aim to isolate gender-associated patterns in submission behavior within comparable career stages, thereby producing a more precise estimate of the gender effect.

      Regarding explanatory power, academic rank does indeed contribute significantly to model fit across our analyses, indicating that it captures meaningful variation in submission behavior. However, even after adjusting for rank, we continue to observe significant gender differences in submission patterns in several disciplines. This suggests that while academic rank explains part of the variation, it does not fully account for the gender gap—highlighting the importance of examining other structural and behavioral factors that shape the publication trajectory.

      Reviewer #2 (Public review):

      Basson et al. present compelling evidence supporting a gender disparity in article submission to "elite" journals. Most notably, they found that women were more likely to avoid submitting to one of these journals based on advice from a colleague/mentor. Overall, this work is an important addition to the study of gender disparities in the publishing process.

      I thank the authors for addressing my concerns.

      Reviewer #4 (Public review):

      Main strengths

      The topic of the MS is very relevant given that across the sciences/academia, genders are unevenly represented, which has a range of potential negative consequences. To change this, we need to have the evidence on what mechanisms cause this pattern. Given that promotion and merit in academia are still largely based on the number of publications and the impact factor, one part of the gap likely originates from differences in publication rates of women compared to men.

      Women are underrepresented compared to men in journals with a high impact factor. While previous work has detected this gap and identified some potential mechanisms, the current MS provides strong evidence that this gap might be due to a lower submission rate of women compared to men, rather than the rejection rates. These results are based on a survey of close to 5000 authors. The survey seems to be conducted well (though I am not an expert in surveys), and data analysis is appropriate to address the main research aims. It was impossible to check the original data because of the privacy concerns.

      Interestingly, the results show no gender bias in rejection rates (desk rejection or overall) in three high-impact journals (Science, Nature, PNAS). However, submission rates are lower for women compared to men, indicating that gender biases might act through this pathway. The survey also showed that women are more likely to rate their work as not groundbreaking and are advised not to submit to prestigious journals, indicating that both intrinsic and extrinsic factors shape women's submission behaviour.

      With these results, the MS has the potential to inform actions to reduce gender bias in publishing, but also to inform assessment reform at a larger scale.

      I do not find any major weaknesses in the revised manuscript.

      Reviewer #4 (Recommendations for the authors):

      (1) Colour schemes of the Figures are not adjusted for colour-blindness (red-green is a big NO), some suggestions can be found here https://www.nceas.ucsb.edu/sites/default/files/2022-06/Colorblind%20Safe%20Color%20Schemes.pdf

      We appreciate the suggestion. We’ve adjusted the colors in the manuscript to be color-blind friendly using one of the colorblind safe palettes suggested by the reviewer.

      (2) I do not think that the authors have fully addressed the comment about APCs and the decision to submit, given that PNAS has publication charges that amount to double of someone's monthly salary. I would add a sentence or two to explain that publication charges should not be a factor for Nature and Science, but might be for PNAS.

      While APCs are definitely a factor affecting researchers’ submission behavior, they mostly do so for lower-prestige journals rather than for the three elite journals analyzed here. As mentioned in the previous round of revisions, Nature and Science have subscription options, and PNAS authors without funding have access to waivers: https://www.pnas.org/author-center/publication-charges

      (3) Line 268, the first suggestion here is not something that would likely work. Thus, I would not put it as the first suggestion.

      We made the suggested change.

      (4) Data availability - remove AND in 'Aggregated and de-identified data' because it sounds like both are shared. Suggest writing: 'Aggregated, de-identified data..'. I still suggest sharing data/code in a trusted repository (e.g. Dryad, ZENODO...) rather than on GitHub, as per the current recommendation on the best practices for data sharing.

      Thank you for your comment regarding data availability. Due to IRB restrictions and the conditions of our ethics approval, we are not permitted to share the survey data used in this study. However, to support transparency and reproducibility, we have made all analysis code available on Zenodo at https://doi.org/10.5281/zenodo.16327580. In addition, we have included a synthetic dataset with the same structure as the original survey data but containing randomly generated values. This allows others to understand the data structure and replicate our analysis pipeline without compromising participant confidentiality.

    1. The ion-ion interaction energy is inversely proportional to the distance between the ions (1/r), while the ion-dipole energy is inversely proportional to the square of the distance (1/r²). So doubling the distance decreases the first by a factor of 2, and the latter by a factor of 4 (and tripling the distance decreases the first by a factor of 3, and the latter by a factor of 9). So ion-dipole interactions are much shorter ranged.

      In principle I understand but I don't understand how the equation explains the message it's trying to convey @rebelford
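
      One way to connect the equation to the message is to compute how much of the interaction energy survives when the distance grows. The sketch below is illustrative only: the function name is hypothetical, and the exponents 1 and 2 are taken from the passage above (1/r for ion-ion, 1/r² for ion-dipole).

```python
def remaining_fraction(distance_factor, exponent):
    """Fraction of the interaction energy left after the distance is
    multiplied by `distance_factor`, for an energy proportional to 1/r**exponent."""
    return 1.0 / distance_factor ** exponent


for factor in (2, 3, 10):
    ion_ion = remaining_fraction(factor, 1)      # energy ~ 1/r
    ion_dipole = remaining_fraction(factor, 2)   # energy ~ 1/r^2
    print(f"distance x{factor}: ion-ion keeps {ion_ion:.3f}, "
          f"ion-dipole keeps {ion_dipole:.3f}")
```

      Because the ion-dipole term is divided by the distance twice, it always falls off faster than the ion-ion term as the particles move apart, which is what "shorter ranged" means here.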

    1. Sharing of 2 electrons results in a single bond; sharing of 4 electrons is a double bond; sharing of 6 electrons is a triple bond

      alkanes, alkenes and alkynes

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Response to reviewers


      We thank the reviewers for their constructive feedback, which has greatly improved the clarity and rigor of our manuscript. We have carefully addressed each comment below, indicating changes made to the text, figures, or supplementary material where appropriate. References to line numbers correspond to the revised version of the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      * In this paper, the authors focus on the role of Reticulon-1C in concert with Spastin in response to axonal injury. In data mining, they find axonal mRNAs encoding for ER-associated proteins including Rtn-1. They establish a knockdown targeting both Rtn-1 isoforms Rtn-1A and Rtn-1C. They observe decreased beta-3-Tubulin levels in the soma while axonal protein levels are unchanged. In microfluidic devices, they characterise the effect of a compartment-specific Rtn-1 KD on axonal outgrowth in the axonal compartment. The authors quantify axonal outgrowth, seeing increased outgrowth in an axonal compartment-specific Rtn-1 KD, while the effect seems to be reversed when applying the KD construct in the somatic compartment. When focussing on the axonal growth cone, they find the Rtn-1 KD shows differences in several morphological features of the growth cone. They find an increase in Tubulin levels in an axonal compartment-specific, but a decrease in a somatic compartment-specific Rtn-1 KD. Colocalisation of Rtn-1C and Spastin is shown to be monolaterally increased following axotomy. Combining axotomy with the Rtn-1 KD shows increases in dynamic microtubule growth rates and track lengths. In another model system, neuron balls, they show Rtn1-C, but not Rtn1-A to be present in the axon. In a puro-PLA assay they also show it can be synthesised in the axonal compartment. To investigate the mechanism enabling the cooperation between Spastin and Rtn-1C, they move to a cell line model in which they see a correlating distribution between Spastin and Rtn-1C but not Rtn-1A. Finally, they use in silico modelling to speculate on binding between Spastin domains and Rtn-1 isoforms.*

      Major comment:

      The rationale behind the work is convincing; however, some interpretations are presented as more robust than the data allow. Most notably, while the interaction between Rtn-1 and Spastin has been shown prior to this study, it is only presented here through in silico analysis. In figure 5, an increase in the growth rate of dynamic microtubules is observed in either a Rtn-1C KD or by using a Spastin-inhibitor. Due to a described increase in colocalisation between Rtn-1C and Spastin (5A), the increase in growth rate is displayed as caused by Rtn-1 promoting Spastin's severing ability. This result might however be correlative. Furthermore, in the injured samples, Spastin levels seemingly increase (in the representative images) and it is thus not surprising that the level of Rtn-1C colocalising with Spastin increases as well. This might not be indicative of a cooperation, and further experimental evidence is required.

      R: We thank the reviewer for this thoughtful comment. We agree that our interpretation should be more cautious, and we have revised the Title, Results and Discussion sections accordingly. In particular:

      1. Following your and the other reviewers' comments, we have analyzed a new set of experiments regarding the STED images of non-injured and injured axons. To eliminate the risk of artifactual descriptions, we have avoided deconvolution and worked directly with raw STED images (Figure 5A). Under these conditions, the distribution of Spastin and its intensity in distal axons are not modified by injury, nor are those of Rtn-1C and Spastin (Supplementary figure 4). We emphasize in the revised text that the in silico modeling we present is supportive, but not definitive, of a direct interaction. To address this concern, we clarify that our study builds on prior evidence of biochemical interaction between Rtn-1C and Spastin (Mannan et al., 2006), and that our own data demonstrate: i) compatible subcellular distribution in axons by super-resolution (STED microscopy, Figure 5A); ii) a potential functional interplay in axons (rescue of β3-tubulin levels by Spastin inhibition, Figure 5B); and iii) isoform-specific co-distribution with Spastin in heterologous cells that is associated with changes in microtubule integrity (see improved Figure 7). Together, these results go beyond correlative localization, but we acknowledge that they do not directly demonstrate a molecular complex in axons. Thus, we now indicate that "Although we did not directly test their molecular association, these results are consistent with Rtn-1C and Spastin sharing a similar subcellular localization, potentially enabling their functional interaction in distal axons" (lines 285-287).

      We would like to clarify a possible misunderstanding: in our experiments, the increase in microtubule growth rate was observed after axonal Rtn-1 KD. Spastazoline (SPTZ) only prevented the reduction in β3-tubulin levels induced by Rtn-1 KD, while leaving the KD-driven increase in growth rate and track length unaffected (Figures 5B-E). Thus, our interpretation is that axonal Rtn-1 KD correlates with increased Spastin function. (lines 307-309)


      Other comments:

      • Generally, graphs would benefit from individual values plotted as well as the summary. Font sizes and types (but rarely) are sometimes inconsistent. Proteins should be consistently written (capitalised or not).

      R: We agree with the reviewer and thank them for taking the time to point out these inconsistencies, as they significantly affect the quality of the work. We have improved several figures and added graphs plotting individual values (Figures 2C, 2E; 4A-E; 5E; 6D). We have reviewed the font sizes and types more carefully and capitalized the proteins accordingly.

      • *Table 1 and figure 1 present data collected from a vast amount of resources. It should be highlighted that datasets from which data was obtained includes many different models, different DIVs and neuronal cell types. Figure 1B may benefit from a different colour scheme. "Ex-vivo" should be "Ex vivo". For "ER mRNAs are a relevant category" it is not described what "relevant" would mean in this context. The title might remove this small part or describe it in the text. It should be described how it is decided that mRNAs are "common". *

      R: We have now highlighted in the Results section the diverse origins of the analyzed samples. We removed the indicated part from the text and explained that common mRNAs were chosen based on the Benjamini-Hochberg (Ben) analysis (page 33, lines 1299-1304).

      * - Figure 2: add description to y-axis to describe what fold change is displayed, applies to multiple figures. Will improve readability of the figures. In 2C, the ROI showing neuronal somata should be increased to show part of the axon and not cut off the soma.*

      __R: __We thank the reviewer for taking the time to highlight this. We have included this modification in figure 2 and throughout the article. We have also enlarged the indicated ROIs in figure 2C as requested. (Page 34)

      • *Figure 3: Three out of four axonal compartments seem to be comprised of dying or damaged axons. Especially the axonal KD scrambled image. It should be ensured that neuronal cultures are healthy. *

      R: We completely agree with the reviewer that the selected images did not reflect the generally good health of the axons, which is supported by the lack of fragmentation and the functional responsiveness shown in Figures 4 and 5B, C, E. Thus, we have now replaced the previous axonal fields with more representative ones (Figure 3, page 36).

      • Typo in "intersections". The schematic of 3B is a great addition to explain the graphs above. Perhaps it could be a bit refined as it is currently hard to see whether this is a neuron or a growth cone without context. Maybe show where the axon connects to the depicted growth cones and change the third icon which looks like it was crossed out. Small formatting issues: remove the additional space before "Figure 3." and add one after "Bar".

      __R: __Many thanks for these great suggestions. We have now improved the figures as suggested and changed the indicated formatting issues. (page 36)

      - Figure 4: If not misunderstanding what is depicted, in 4A and B, different lookup tables are used to depict the same signal. Only one of each images is necessary. Do the axons have more tiny branches in the Rtn-1 KD condition in 4A? Unclear why Rtn-1 levels are increased in the Rtn-1 KD (4C), please clarify.

      __R: __We thank the reviewer for these observations. The reviewer is correct that different lookup tables were initially applied to the same image. Our intention was to highlight the fine distribution of axonal Rtn-1, but since this aspect is already clearly shown in previous figures, we now retain only a single lookup table. The appearance of tiny branches in the Rtn-1 KD condition represents an isolated observation and does not reflect a consistent or robust phenotype associated with Rtn-1 KD.

      As the reviewer points out, the increase of Rtn-1 in the cell bodies of injured neurons following axonal KD was initially surprising to us. However, this was a consistent phenomenon, as shown in the improved Figure 4. Of note, previous studies have reported that total Rtn-1C (but not Rtn-1A) levels increase in response to injury in cortical neurons (Fan et al., 2018). In our case, we interpret this as a compensatory somatic response triggered by the local reduction of Rtn-1 in injured axons. This interpretation is also consistent with the apparent lack of effect of siRNA on distal axonal Rtn-1 levels when applied locally after injury (while somatic application of the same siRNA does reduce axonal Rtn-1). Thus, after 24 hours of KD, the somatic upregulation of Rtn-1 may partially compensate for the expected decrease in its local synthesis. We have clarified this assumption in the revised text (lines 247-251).

      - Figure 5: It may be easier to understand what "axotomy" samples are if just referred to as "injured" as later in the same figure. The procedure could also very briefly be explained in the results. 5C should depict AUC in µm2 not µm. 5D Spastin is barely visible, brightness and contrast should be adjusted to enhance visibility.

      __R: __We thank the reviewer for these helpful suggestions and have implemented the requested changes in Figure 5. Specifically:

      We now consistently refer to "axotomy" samples as "injured" throughout the figure and article. In addition, a brief explanation of the axotomy procedure has been added before Figure 2 and before Figure 5, and the description has been clarified in Materials and methods (lines 191-192, 289-290 and 779-787).

      To improve the reproducibility of our outgrowth measurements, we revised this analysis approach. Based on previous work from a co-author (McCurdy et al., 2019), instead of reporting the "relative number of intersections," we now present the total counts obtained from Sholl analysis of binarized axons (see Methods). To this end, we took advantage of the NeuroAnatomy plugin of FIJI, which more precisely tracks axon trajectories and makes the measurement more independent of axon width. This new approach also avoids the conflict we had with what we considered the "first line" after the groove ends, which was somewhat arbitrary. Accordingly, the correct term is now "summation of intersections (sum.)" at different distance bins, as reflected in Figure 5D. (page 40)

      For the former Figure 5D (now Figure 5B), we have improved the acquisition of representative images and applied a different set of lookup tables to enhance visibility. (page 40)

      - Figure 6: It should be made clear why it is necessary to switch to another model system just for 6A, please indicate this in the text. PCR bands seem very pixelated, check the quality. It is unclear why soma genes/proteins were only tested with either PCR or WB others with both. Rtn-1C and Rtn1-A should be presented in the same order in the PCR and WB panel. Correct "Rtn1-1A" typo. In 6D, 1.5 dots per soma seems like a low number. When normalized to the area the soma vs the axon occupies, the compartmentalization does not work? Maybe it makes sense to refine analysis or apply puromycin in the somatic compartment and analyze the axonal compartment as comparison?

      __R: __Many thanks for these observations. We have now included the following clarification in the text: "We sought to characterize the isoform expression of Rtn-1 mRNA and protein in both axons and cell bodies. Because microfluidic chambers yield only limited cellular material, we adopted an alternative culture approach using 'neuronballs.' This method enables the segregation of an axon-enriched fraction by mechanically separating axons from somato-dendritic structures" (lines 375-376).

      The resolution of the PCR bands has been improved in the revised figure. Note that, because the amount of cellular material is relatively scarce, we did not obtain very strong bands.

      The difference in the genes/proteins used for characterizing RNA and protein samples reflects our intention to treat both approaches as complementary. The PCR markers were primarily included to confirm sample purity, which also applies to the WB samples since they derive from the same preparation. In both assays, we used MAP2 as a dendritic marker to demonstrate axonal purity. While we acknowledge that the same genes could have been tested by both methods, we believe the results as presented adequately demonstrate the effective isolation of axons.

      We have switched the order of Rtn-1C/1A for consistency across PCR and WB panels and corrected the indicated typo in Figure 6A.

      We agree with the reviewer that an average of 1.5 puncta per soma initially appeared low. We have identified at least three reasons for this:

      First, the signal derives from only a 15-minute puromycin pulse, which is a very short labeling window. Second, our puro-PLA assay is particularly stringent, as ligation relies directly on puromycin- and Rtn-1C-labeled primary antibodies, without the additional spacing normally introduced by secondary antibodies. In standard PLA, the critical distance for amplification is ~30-40 nm, whereas in our assay this distance is even more restrictive. Third, in our initial analysis we applied an overly cautious threshold to define "true" amplification. We have now refined this threshold using a baseline defined by the absence of puromycin stimulation. With this improved criterion, we now quantify an average of ~5 puncta per soma and ~10 puncta per 1000 µm² of axonal area (Figure 6D and Supplementary Figure 3D). Assuming a neuronal soma diameter of 15 µm (area ≈ 176.71 µm²), this yields ~0.028 puncta per µm² in the soma. In comparison, axons display ~0.01 puncta per µm², approximately one-third of the soma value, which is compatible with the idea that cell bodies dominate neuronal protein synthesis.
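      The density comparison above can be reproduced with a short calculation. The following is only an illustrative sketch: it assumes, as stated above, that the soma is approximated as a flat disc of 15 µm diameter, and the function and variable names are hypothetical rather than part of the analysis pipeline.

```python
import math


def disc_area_um2(diameter_um: float) -> float:
    """Area of a disc (in µm²) from its diameter (in µm)."""
    return math.pi * (diameter_um / 2.0) ** 2


# Values quoted in the response (approximate means).
soma_diameter_um = 15.0          # assumed soma diameter
puncta_per_soma = 5.0            # ~5 puro-PLA puncta per soma
axon_puncta_per_1000_um2 = 10.0  # ~10 puncta per 1000 µm² of axonal area

soma_area = disc_area_um2(soma_diameter_um)       # ≈ 176.71 µm²
soma_density = puncta_per_soma / soma_area        # puncta per µm² in the soma
axon_density = axon_puncta_per_1000_um2 / 1000.0  # puncta per µm² in axons
ratio = axon_density / soma_density               # axon density relative to soma

print(f"soma area: {soma_area:.2f} µm²")
print(f"soma density: {soma_density:.3f} puncta/µm²")
print(f"axon density: {axon_density:.3f} puncta/µm²")
print(f"axon/soma ratio: {ratio:.2f}")
```

      Under these assumptions the axonal density comes out at roughly one-third of the somatic density; a different soma diameter would change the ratio quadratically.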

      Following the reviewer's valuable suggestion, we performed additional quantifications in which puromycin was applied exclusively to the somatic compartment. Under these conditions, we still observed amplification in axons (~5 puncta per 1000 µm²), although this value was significantly lower than when puromycin was applied directly to axons. This analysis provided a novel appreciation of the puro-PLA technique in neurons: at least half of the signal originates in the axonal compartment, while a portion may reflect proteins synthesized in soma and transported anterogradely to the axon through yet-unknown mechanisms (potentially involving rapid anterograde transport) (Figure 6D). (page 42)

      • Figure 7: 7A shows two images depicting the same information that may not be needed. Can probably be removed. In 7B there is no negative (or any) correlation between Spastin levels and Tubulin, however later it is mentioned that Rtn-1C transports Spastin thus causing a decrease in Tubulin at certain locations? It is unclear if Spastin levels vary intensely between different samples. Mean intensity of the somatic area may be beneficial to rule this out. 7B Tubulin on the right top panel seems to have a decrease in Tubulin levels which is not visible due to the Y axis of Tubulin being set to a different range than the middle and lower panel. The average of line scans from multiple cells may be helpful to determine whether there is indeed no colocalization between Rtn-1A and Spastin. The provided representative images seem to show similar degrees of colocalization between Spastin and Rtn-1A/C.

      R: We thank the reviewer for these valuable observations and acknowledge that Figure 7 may have caused confusion. We have removed the fluorescence line-scan traces, as they can be biased depending on the region of the cell analyzed. Although this may not have been sufficiently emphasized in the text, we had already performed a quantitative colocalization analysis across multiple cells and independent experiments, using Manders' coefficients (Figure 7B). These analyses showed higher colocalization of Spastin with Rtn-1C than with Rtn-1A. Regarding the concerns about variability in Spastin levels or possible bias from Y-axis scaling, removing those traces also eliminates this risk of bias. In addition, we had already quantified the total tubulin fluorescence intensity across all the z-stacks and from multiple cells from independent experiments, as shown in Figure 7C. To further rule out artifacts caused by variable transfection efficiency, we quantified total fluorescence intensity in both RFP and GFP channels across conditions. As shown in Supplementary Figure 6, no significant differences were observed, suggesting that the changes in tubulin reflect specific effects of Spastin/Rtn-1C co-expression rather than variability in expression levels.

      Results: - It would be helpful to reiterate the hypothesis at the start to ease the reading flow.

      R: Many thanks, we have introduced a line reiterating the hypothesis as suggested (lines 117-118).

      - There seems to be minor redundancy in lines 132-138.

      __R: __Indeed, we have now removed the indicated phrase.

      • There are several spellings, proof-reading is recommended. For example, in line 136 should be "promotes". 160 "localla", 192 should be "the actin cytoskeleton".,194 should be "we first examined", 195 should be "Different", 223 "using", 259 "axons".

      R: We apologize for the spelling errors; we have now carefully proofread the manuscript and introduced these corrections.

      - 154-155: Unclear, why the lower MW Rtn-1C was seen as more important.

      R: We apologize for not being clear enough. It is not necessarily more important; we simply took the Rtn-1C molecular weight as the reference for the analysis, considering that this isoform is the predominant one in axons. In any case, we found a significant effect for both isoforms, at least with siRNA 2 (data not shown), which is now stated in the text (lines 165-169): "We also examined the 180 kDa band and found that siRNA 1 reduced expression to a mean of 0.41 relative to Scr, showing a strong trend that did not reach statistical significance (p = 0.05; N = 3; Wilcoxon test compared to 1, data not shown). In contrast, siRNA 2 further reduced expression to a mean of 0.29, which was statistically significant (p = 0.04; N = 3; Wilcoxon test compared to 1, data not shown)."

      - 167 results of 2E not stated before interpreting them.

      __R: __We have corrected this mistake.

      - 181 would suggest "outline" instead of "perimeter".

      R: We have considered this suggestion and included "outline"; nevertheless, the morphometric parameter is defined as perimeter, so we retained the term with the suggested clarification.

      - 183-184 "longest shortest path" is a confusing term.

      R: We agree that it is a confusing term and have now introduced multiple clarifications of the term in the legend of figure 3 (page 36), and in more detail in a new section of Materials and methods (lines 697-699).

      • figure 4B should be referenced earlier in the sentence.

      __R: __We have corrected the sentence in the text.

      - 243-244 may be correlation. Rtn-1 and Spastin do not necessarily interact so that this result is achieved.

      R: Thanks for the clarification. We are aware that, as previously written, the conclusion was not justified; thus, we now state at the end of the paragraph: "Together, these observations suggest that axonal Rtn-1 KD correlates with higher Spastin microtubule severing" (lines 307-309).

      - 246: In figure 1 the KD seemed to influence both Rtn-1 isoforms, why not here anymore? 259 "axons". 284 "counteract" instead of "suppress"?

      R: We acknowledge the confusion at this point of the article caused by measuring a specific isoform. We now indicate that we focus on Rtn-1C because of previous evidence in the literature pointing to an interaction of Rtn-1C with Spastin (lines 264-267). Later we show that Rtn-1C is the predominant isoform in axons (Figure 6). We have implemented all the suggested corrections in the manuscript.

      - 485: rephrase as the interaction between Rtn-1C with Spastin has not been shown directly in these experiments.

      R: Many thanks for the relevant clarification. We have now corrected the sentence: "Here, we have described an emerging mechanism relating Rtn-1C with the activity of Spastin, which is the most frequently mutated isoform in HSP (Hazan et al., 1999; Mannan et al., 2006)." (lines 632-634).

      *Methods: 535 "in PBS". 543 citation error. 689-699 is it necessary to add a gaussian blur?*

      R: We have corrected the words and removed the wrong reference. The use of Gaussian blur is a very important point. We used this approach because, in our experimental conditions, it was critical to highlight moving particles that would otherwise be lost in the noise. This was particularly evident for the seemingly more "unorganized" movements of axonal microtubules after injury.

      References: Mannan, A U et al. appears twice in the citation list (36 and 44).

      R: Many thanks for the observation. We have now corrected it.

      Reviewer #1 (Significance (Required)):

      Overall, this manuscript describes novel findings which will be interesting to the neuronal cell biology community and scientists working on the field of neuronal injury and regeneration. It is well structured, and the data are mostly well presented but sometimes conclusions are over-interpreted. However, several points need to be addressed in a more convincing way.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Axonal mRNA localization and localized translation support many neuronal functions and is an important determinant of the regenerative potential of axons after injury. How this works mechanistically remains unclear. The authors present a well performed and technically challenging study in which they identify RTN-1 as a regulator of axonal outgrowth after injury. They provide evidence using experiments in microfluidic chambers that RTN1 is locally synthesized in axons. Interestingly, they identify a (local) interplay between RTN1 and Spastin which affects microtubules and thereby regulates the outgrowth of cortical axons after injury. This study provides an interesting new link between a locally synthesized protein (RTN1) and a microtubule-regulating protein Spastin that is changed upon axon injury. This provides an advance in our understanding in axon regeneration after injury and provides the basis for new studies that can further investigate this interplay. Although interesting, I have several concerns that should be clarified and are needed to substantiate the findings and model presented in this study.

      Major concerns:

      1. In figure 1, the authors provide an analysis of overlapping axonal mRNAs. There are more axonal transcriptome studies and a recent study by von Kugelgen and Chekulaeva (2020; doi: 10.1002/wrna.1590) already performed such an analysis, which included more studies. It would be good to mention this. It can be perceived that studies were now chosen to get the outcome that Rtn-1 is present in all studies. For example, von Kugelgen finds mRNA coding for RTN3, another ER structural protein, as present in 16 out of 20 studies analyzed. That said, the authors present more reasons to look at Rtn-1, so the selection to continue with this protein remains valid but can be written up differently so not to present it as the 'sole' ER-shaping protein consistently present in axonal transcriptomes.

      R: We appreciate this important observation, which enriches the article; we are aware that the transcriptome analysis could be expanded further to include more recent studies. Thus, we have now included this reference in the main text and highlighted the relevant finding on RTN3. However, von Kugelgen and Chekulaeva used data from dendrites and axons (neurites). Thus, we indicate that "...On a similar approach, but combining data from dendrites and axons, it was found that Reticulon-3 mRNA is present in 16 out of 20 studies, further suggesting a wider presence of other mRNAs coding for ER structural proteins in axons" (lines 128-131).

      2. The description of methods is currently insufficient and incomplete and does not allow for reproducibility of this study. For example, different Rtn-1 antibodies seem to be used in this study. Is the same antibody used for staining and WB? There is no listing of any of the antibodies used in the study and which one is used for which technique/experiment. This should be clarified and should be easy to do so in the methods section (antibody name, origin/company, dilution used) to enhance reproducibility of this study. This is not limited to primary antibodies and any information on secondary antibodies, including what was used for STED is completely missing.

      R: Thanks for these critical comments. First, we apologize for the former version of the Methods, which was not as accurate as it should have been. We have now revisited it and improved several points throughout this section. The primary and secondary antibodies, plasmids, siRNAs, and general reagents are all listed in the Supplementary material, including company and dilution ("Reagent tables").

      • The timeline of KD experiments in Figures 2 and 3 are unclear. For the Western blot KD is performed at DIV7 and collected 48 hours later. However, this is not specified for the stainings done in Figure 2C-E. Is this also at DIV7 and then for 48 hours? In figure 3 the siRNA is added at DIV8 (together with axotomy) and outgrowth is measured 24 hours later. Is 24 hours sufficient to achieve knockdown? Is this also what was done for stainings? Later on in Figure 5B, 48 hours of KD is again used. It is unclear what the rationale of these differing timepoints is. Why was this chosen? Is the timeline also the reason for the difference in segment lengths chosen? In Figure 3, there is a significant effect on outgrowth in the KD in the 'mid-range' which is not present in Figure 5.*

      R: We regret the confusion; all this information is now explicitly clarified in the main text (lines 297-299) and the corresponding figure legends. We had specific reasons for using these different time points. Figure 2A-B is aimed at validating the siRNA against Rtn-1, so we treated 7 DIV cultures for 48 hours to be sure of revealing a global effect by WB. In figure 2C-D, we used the same 7 DIV cultures, but only for 24 hours. The reason is that, once the RNAi was validated, we explored its control of local synthesis over a shorter period, based on previous literature supporting that axonal KD for 24 hours is sufficient for regulating axonal transcripts (Batista et al., 2017; Gracias et al., 2014; Lucci et al., 2020). We are also confident in using this time point based on the new supplementary figure 3D, which shows a significant decrease in puro-PLA signal (indicative of Rtn-1C synthesis) 24 hours after axonal KD.

      In figure 3, we performed axotomy, so we had to wait a longer period for axons to grow (8 DIV) before fully cutting them; in this case we performed axonal KD from 8 to 9 DIV. This is the same period used for the staining and quantifications shown in figure 4. All this is now clarified in the main text and figures.

      In Figure 5 we performed a more challenging experiment that required transfecting cells with an EB3-GFP plasmid, then performing axotomy along with axonal KD as well as pharmacological treatment selectively in the axonal compartment. First, we tried to measure microtubule dynamics under the same time frame as in figure 3. However, expression levels of EB3-GFP were not adequate for axonal measurements by live-cell imaging. Therefore, compared to figure 3, we extended the time frame after axotomy by 24 hours (from 9 to 10 DIV) for this technical reason, but also to explore whether the changes in tubulin intensity might be revealed more clearly (which was the case, figure 5B). These considerations are now included in the main text.

      Regarding the significant effect on outgrowth in the KD in the 'mid-range', which is not present in Figure 5: given that in figure 5D axons are left growing for two days instead of one, the number of intersections and the differences between conditions are modified compared to figure 3, while retaining the overall trends. Note that, to improve the reproducibility of our outgrowth measurements, we revised this analysis approach. Based on previous work of a co-author (McCurdy et al., 2019), instead of reporting the "relative number of intersections," we now present the total counts obtained from the Sholl analysis of binarized axons (see Materials and methods). To this end, we took advantage of the NeuroAnatomy plugin of FIJI, which precisely tracks axon trajectories and makes the measurements more independent of axon width segmentation. This new approach also avoids the conflict we had with what we considered the "first line" after the groove ends, which was somewhat arbitrary. Accordingly, the correct term is now "summation of intersections (sum.)" at different distance bins, as reflected in the new Figure 5D.

      Could the authors provide a rescue condition for their siRNA (using a siRNA-resistant construct) to show that their siRNA is specific for RTN1. They nicely show the efficiency of the siRNA but not its specificity. This is crucial because if not specific, this will affect a large part of their study. They already have RTN1A and RTN1C constructs available. Such a rescue experiment should ideally also be performed for one or more of their phenotypic experiments, such as the one presented in Figure 3A or 5 to show that the phenotype is really RTN1 dependent. If done by re-expressing either RTN1A or RTN1C, this could provide insightful information on the relevant isoforms.

      R: We agree with the reviewer that this is a critical point. A major challenge in demonstrating the functional role of axonally synthesized proteins using a KD approach is that the rescue may also need to occur locally. Since axonal Rtn-1 appears to play a distinct role compared to its somato-dendritic counterpart (Figure 3), a siRNA-resistant construct would ideally require an axon-targeting sequence to restore local synthesis. As this is technically demanding, we have not yet been able to perform such an experiment, but we are actively working on identifying the optimal sequence to direct Rtn-1C to axons. Importantly, studies performing axonal KD typically rely on at least two independent siRNA sequences, thereby minimizing the likelihood that a phenotype arises from off-target effects. Thus, we have now validated a third siRNA (siRNA 3), which selectively downregulates Rtn-1C. Then, following the same experimental frame as in figure 3, we performed axonal Rtn-1 KD after injury and observed that siRNA 3 also significantly increases the outgrowth of injured axons (Supplementary figure 2). This suggests that at least this phenotype is not the product of an off-target effect. In addition, pharmacological rescue with the Spastin inhibitor SPTZ mitigated both the reduction in distal axonal β3-tubulin and the increase in axon outgrowth, supporting that the observed phenotypes are unlikely to arise from off-target effects. If these effects were due to random interference with unrelated mRNA targets, inhibition of an ostensibly independent target such as Spastin would not be expected to yield such a consistent rescue. Accordingly, SPTZ treatment alone did not increase β3-tubulin, indicating that its action is specifically contingent upon Rtn-1 KD.

      Taken together, the pharmacological rescue in axons (Figure 5B) and the Rtn-1C/Spastin co-distribution in heterologous cells, which correlates with preserved microtubules (improved Figure 7), provide converging evidence to suggest that Rtn-1C-Spastin interplay may underlie the observed phenotypes in axons.

      • I find the data presented in Figure 4A/B confusing. Axonal RTN-1 KD does not reduce axonal RTN1 levels, but somatic KD does. I understand that this implies most protein comes from the soma, and the authors indeed present an explanation that increased somatic RTN1 occurs after axonal KD as a compensation mechanism. However, this can also be interpreted that there is no axonal synthesis of RTN1 after injury and axonal KD has indirect or even aspecific effects. Their model depends on this difference. Their data in Figure 6 could provide supporting evidence if it shows RTN1 puro-PLA after injury. Along these same lines, in Figure 6, they nicely include a compartment control for puro-PLA. It therefore seems doable to include a somatic puromycin control for their axonal puro-PLA, to exclude any diffusion/transport of the newly synthesized peptides. This is especially considering two recent papers reporting on this possible phenomenon, although these studies were not performed in neurons.

      R: We consider the possibility that there is no axonal Rtn-1 synthesis after injury to be a plausible and relevant point. Unfortunately, we could not perform a puro-PLA experiment after injury, which would have provided a more definite answer. However, we are now more confident that we regulate Rtn-1 synthesis before injury, as supported by a new supplementary figure 3D that shows a significant decrease in puro-PLA signal (indicative of Rtn-1C synthesis) 24 hours after axonal KD. Thus, based on the similar phenotypes observed before and after injury, we consider our results to be still compatible with Rtn-1 axonal synthesis being downregulated, but not absent, after injury. First, axonal Rtn-1 KD decreased β3-tubulin levels before and after injury according to figure 5B and the improved statistical analysis performed on figure 2E. Similarly, axonal Rtn-1 KD significantly increases microtubule growth rate before and after injury according to the current statistical comparisons (Figure 5E). Second, if the β3-tubulin decrease were merely the result of unspecific siRNA targeting, it is unlikely that SPTZ treatment would increase and restore β3-tubulin levels only in the context of axonal Rtn-1 KD (Figure 5B). We have now included these considerations in the discussion (lines 537-543). Although on a different track, the mechanistic relationship between Rtn-1C and Spastin suggested in Figure 7 makes it more plausible that a similar phenomenon regarding the control of tubulin levels may occur locally in axons.

      Following the reviewer's valuable suggestion, we performed additional quantifications in which puromycin was applied exclusively to the somatic compartment. Under these conditions, we still observed amplification in axons (~4 puncta per 1000 µm²), although this value was significantly lower than when puromycin was applied directly to axons (~10 puncta per 1000 µm²). This analysis provided a novel appreciation of the puro-PLA technique in neurons: at least half of the signal originates in the axonal compartment, while a portion may reflect proteins synthesized in the soma and transported anterogradely to the axon through yet-unknown mechanisms (potentially involving rapid anterograde transport). Note that we revised the criteria for detecting true amplification spots, using staining without puromycin as the baseline, which increased the number of true amplification spots. Still, these seemingly low values are compatible with the limited labeling time (a puromycin pulse of only 15 minutes) and with the stringent conditions of this experiment, in which secondary antibodies were avoided by directly labeling the primary ones. This approach makes the classical 30-40 nm distance for PLA even narrower, thus reducing the signal. In any case, assuming a neuronal soma diameter of 15 µm (area ≈ 176.71 µm²), the ~5 puncta per soma that we now quantify yield ~0.028 puncta per µm² in somata. In comparison, axons display ~0.01 puncta per µm², approximately one-third of the soma value, which is consistent with the expected difference in ribosome density.
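The "at least half" estimate can be checked with a short calculation. This is only a sketch: it assumes that the axonal signal seen with somatic-only puromycin (~4 puncta per 1000 µm²) is an upper bound on the transported share of the signal seen with axonal puromycin (~10 puncta per 1000 µm²), and that the two contributions are additive. The variable names are illustrative and do not come from the manuscript.

```python
# Approximate puncta densities quoted in the response (puncta per 1000 µm² of axon).
AXONAL_PURO_DENSITY = 10.0   # puromycin applied directly to axons
SOMATIC_PURO_DENSITY = 4.0   # puromycin applied only to the somatic compartment

# Assumption: the somatic-only signal bounds the share of axonal puncta that
# could come from peptides made in the soma and transported into the axon.
transported_fraction = SOMATIC_PURO_DENSITY / AXONAL_PURO_DENSITY

# The remainder is attributed to synthesis within the axonal compartment.
local_fraction = 1.0 - transported_fraction

print(f"Possible soma-derived share: {transported_fraction:.0%}")
print(f"Share attributed to axonal synthesis: {local_fraction:.0%}")
```

Under these assumptions, the locally derived share is the complement of the 4/10 ratio, which agrees with the statement that at least half of the signal originates in the axon.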

      • In Figure 5A the authors find an increased co-localization (RTN1/Spastin) after axotomy. From their images, it seems that the amount of Spastin is hugely increased, which would by default increase the chance of (random) colocalization of RTN1 on Spastin. Could the authors comment on this?

      R: Thanks for this relevant and constructive critique. We formerly based our colocalization analysis on deconvolved images. However, after performing several quantifications with different deconvolution parameters, we were not convinced of the robustness of this finding or of the staining. We therefore performed a new set of experiments and found that non-deconvolved images from the STED microscope were more informative about the expected tubular morphology of the axonal ER. Accordingly, we improved Figure 5A, and the main conclusion is now simply that both proteins are closely distributed in distal axons before and after injury.

      • In figure 5E and 5F, the condition of scr + SPTZ is omitted. What is the reason for this? The explanation of results in these figures is confusing. The authors report a 'clear trend' in increase in comet track length and lifetime upon addition of SPTZ to axonal RTN-1 KD. This is however not significant. The comparisons that are made afterwards are confusing (e.g. increase in comet lifetime of SPTZ in non-injured axons with RTN1 KD compared to Scr+DMSO and KD + DMSO in injured axons). Their conclusion is that axonal RTN-1 synthesis in injured axons (see my concern in the points above on this) governs microtubule growth rate beyond Spastin activity, yet blocking Spastin activity still completely blocks the effect of KD on outgrowth.

      R: We thank the reviewer for this observation and fully agree that the general description of Figure 5E was not satisfactory. We have reorganized the description of these results and performed more relevant statistical comparisons (lines 338-359). Based on the reviewer's observation, we now conclude: "Together, these results suggest that axonal Rtn-1 synthesis controls microtubule dynamics in both non-injured and injured axons, mostly independently of Spastin-mediated microtubule severing." (lines 357-359).

      Other/minor concerns:

      - The gene ontology analysis in Figure 1A contains the category 'Endoplasmic reticulum'. In this category are mainly ribosomal proteins. Although in a gene ontology analysis these proteins will be included in this category, it is misleading in this respect since they are just as likely to be coming from cytoplasmic ribosomes. Although it cannot be excluded that these are ER-bound ribosomes, not in the last place because a recent study (Koppers et al., 2024, doi: 10.1016/j.devcel.2024.05.005) found ribosomes attached to the ER in axons, I believe the category should be adapted or at the least clarified in the text.


      R: Many thanks for the suggestion, which is now included in the text: "Note that several of the identified transcripts in the category 'endoplasmic reticulum' code for cytoplasmic ribosomal components, which indeed can be attached to the axonal ER (Koppers et al., 2024) and be locally synthesized in axons (Shigeoka et al., 2019)." (lines 125-128)

      - Is RTN-1C isoform still an ER-shaping protein or rather an ER protein with alternative functions? The final sentence in the abstract makes a statement that a locally synthesized ER-shaping protein lessens microtubule dynamics. Could the authors provide a clearer description and discussion of the evidence in literature for this? RTN1C has been suggested to perform alternative functions in which case the statement that the local synthesis of an ER-shaping protein is important for axonal outgrowth should be adapted.

      R: We agree with the reviewer and are aware that some non-canonical roles of Rtn-1C may partially explain the observed phenotypes. Thus, we have rephrased the last statement of the abstract: "These findings uncover a mechanism by which axonal protein synthesis provides fine control over the microtubule cytoskeleton in response to injury." We have also modified the discussion section, introducing new references accordingly: "...Some studies have pointed to a non-canonical role for Rtn-1C in the nucleus, including DNA binding and histone deacetylase inhibition (Nepravishta et al., 2010, 2012). It is tempting to speculate that these still emerging roles may also contribute to the observed phenotypes. Of note, different axonally synthesized proteins exert transcriptional control in response to injury or local cues (Twiss et al., 2016)." (lines 576-580).

      • Is there a difference in RTN1 distribution or levels pre- and post-axotomy?

      R: Thanks for the suggestion. With the new analysis, we found only a slight reorganization of Rtn-1C and Spastin in distal axons (Figure 5A). We have now also included a quantification of their levels and found no significant differences for either protein (Supplementary Figure 4).

      - Line 100/101 states 'the interactome of the axonal ER provides...'. To my knowledge there has been no study looking at the interactome of the axonal ER specifically. Surely axonal ER proteins are known but there is a difference.


      R: We agree with the reviewer that the phrase was misleading, so we rephrased it in the introduction: "...Different lines of evidence support that the protein components of the axonal ER may interact with proteins that regulate microtubule dynamics."

      - Typo line 160 'localla'


      R: Thanks for taking the time; we have now corrected it.

      - In Figure S1 B, please add the DIVs to make it clearer what each graph corresponds to. The legend of S1B states different distances from the cell body but the graph shows distances from the tip.


      R: We have now corrected the legend accordingly.

      - Figure 2C, why does B3 tubulin decrease in soma, aspecific effect of siRNA?


      R: This was indeed an unexpected finding. However, we do not observe unspecific or global changes in β3-tubulin levels (see Figure 2A and Supplementary Figure 2). Considering our other results linking Rtn-1 to the regulation of the microtubule cytoskeleton, we interpret this decrease as an indirect effect of Rtn-1 depletion rather than an off-target action of the siRNA. Moreover, if the effect were unspecific, both proteins would likely be reduced in the cell body, given that the siRNA was specifically designed to target Rtn-1 as its primary sequence-specific target.

      - What is the rationale on the opposite effect found in outgrowth in Figure 3?


      R: The apparently opposite outcomes observed in Figure 3, where axonal versus somatic Rtn-1 knockdown leads to divergent effects on axonal outgrowth, can be explained by compartment-specific environments and isoform distribution. The siRNA targets the conserved RHD region, reducing both Rtn-1A and Rtn-1C. Axons are enriched in Rtn-1C; thus, axonal KD preferentially reduces Rtn-1C. In contrast, somatic KD reduces both isoforms. Rtn-1A, predominant in cell bodies, may engage other signaling pathways (Kaya et al., 2013). Interestingly, Nozumi et al. (2009b) reported that global Rtn-1 depletion reduces axonal outgrowth in developing cortical neurons. This aligns with the notion that somatic KD mimics a more global loss of function, whereas axonal KD reveals a compartmentalized, pro-regenerative effect due to local Rtn-1C regulation. (All the references indicated here are in the main manuscript.) These considerations are now included in the discussion (lines 581-593).

      - Missing word 'we' on line 194


      R: We have corrected it.

      - Typo line 629 'witmn h', please proofread the entire manuscript carefully.


      R: We apologize for the spelling errors; we have now carefully revised the manuscript.

      - Could the authors comment on why, in Figure 7B/C, GFP only is colocalizing with Spastin-RFP? In general, GFP should be diffusive and not display punctate colocalization with Spastin.


      We appreciate the reviewer's comment. Under normal conditions, GFP displays a diffuse cytoplasmic distribution. However, in our experimental setup, we observed punctate GFP signals only in the context of co-expression with Spastin-RFP. This is consistent with prior reports showing that soluble GFP can occasionally be sequestered into late endosomal structures (Sahu et al., 2011), which are also known to harbor the M87 Spastin isoform (Allison et al., 2013; Allison et al., 2019). To rigorously exclude the possibility of unspecific fluorescence crosstalk, we independently acquired each fluorophore channel and confirmed that GFP puncta were genuine and not due to bleed-through (Supplementary Figure 5). Further, cells expressing only GFP or only Spastin-RFP did not show overlapping puncta, and co-expression of GFP with Rtn-1A-RFP did not produce any apparent overlap, indicating that the punctate GFP pattern is specifically associated with Spastin co-expression. Thus, the observed GFP colocalization with Spastin reflects a biological phenomenon potentially linked to the endosomal localization of M87 Spastin, and not an artifact of imaging or fluorophore bleed-through.

      Reviewer #2 (Significance (Required)):

      Axonal mRNA localization and localized translation support many neuronal functions and are an important determinant of the regenerative potential of axons after injury. How this works mechanistically remains unclear. The authors present a well performed and technically challenging study in which they identify RTN-1 as a regulator of axonal outgrowth after injury. They provide evidence using experiments in microfluidic chambers that RTN1 is locally synthesized in axons. Interestingly, they identify a (local) interplay between RTN1 and Spastin which affects microtubules and thereby regulates the outgrowth of cortical axons after injury. This study provides an interesting new link between a locally synthesized protein (RTN1) and a microtubule-regulating protein Spastin that is changed upon axon injury. This provides an advance in our understanding of axon regeneration after injury and provides the basis for new studies that can further investigate this interplay. Although interesting, I have several concerns that should be clarified and are needed to substantiate the findings and model presented in this study.


      The audience for this study will be mainly basic research in the fields of both axonal protein synthesis and axon regeneration. My expertise is in the field of mRNA localization and local protein synthesis.

      Batista, A. F. R., Martínez, J. C., & Hengst, U. (2017). Intra-axonal synthesis of SNAP25 is required for the formation of presynaptic terminals. Cell Reports, 20(13), 3085. https://doi.org/10.1016/J.CELREP.2017.08.097

      Fan, X. xuan, Hao, Y. ying, Guo, S. wen, Zhao, X. ping, Xiang, Y., Feng, F. xue, Liang, G. ting, & Dong, Y. wei. (2018). Knockdown of RTN1-C attenuates traumatic neuronal injury through regulating intracellular Ca2+ homeostasis. Neurochemistry International, 121, 19-25. https://doi.org/10.1016/J.NEUINT.2018.10.018

      Gracias, N. G., Shirkey-Son, N. J., & Hengst, U. (2014). Local translation of TC10 is required for membrane expansion during axon outgrowth. Nature Communications, 5(1), 1-13. https://doi.org/10.1038/ncomms4506

      Lucci, C., Mesquita-Ribeiro, R., Rathbone, A., & Dajas-Bailador, F. (2020). Spatiotemporal regulation of GSK3β levels by miRNA-26a controls axon development in cortical neurons. Development (Cambridge), 147(3). https://doi.org/10.1242/DEV.180232

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript investigates the relationship between the endoplasmic reticulum morphogen reticulon-1 (Rtn-1) and the microtubule severing protein spastin in axons after injury. The main message and conclusion of the paper is that local axonal synthesis of Rtn-1 plays a role in regulating the microtubule severing activity of spastin by interacting with spastin and inhibiting its activity. This mechanism would be important after injury by regulating axonal growth.

      The conclusions of the paper are based on the following claims:

      1) Rtn-1 is synthesized locally in axons.

      2) Specific downregulation in Rtn-1 in axons using microfluidic chambers affects microtubules abundance (measured by beta-3 tubulin) and promotes axon growth after injury.

      3) Inhibition of spastin MT-severing activity with a specific drug rescues the growth effect induced by axonal downregulation of Rtn-1.

      4) Rtn-1c interacts with spastin-M87 to limit its MT-severing activity in a cellular system upon overexpression.

      Major comments:

      1) Evidence that Rtn-1 is synthesized in axons comes from two experiments. Initially, the authors show that Rtn-1 siRNA transfection in the axonal compartment of microfluidic chambers reduces Rtn-1 levels in axons, suggesting that there is some local synthesis. Although this method is very attractive, I am concerned about the statistical analysis. The graphs show bars rather than individual data points from the average of many neurons (about 300). The plots also show the SEM instead of the SD, thus covering all the variability that is inherent in this type of experiment. The statistics are probably not performed on the 3 biological replicates, but consider the individual neurons as N. This is obviously not correct, since neurons in an experiment may all be affected by the same technical problem and are not independent replicates. For this reason, I am a bit skeptical about this quantification. Another problem is that the quantification of the fluorescence intensity of the sample does not take the nuclei into account. Are the nuclei removed for analysis? Are the images single planes? Addressing the quantification issues is crucial also for data in Figure 4, where the authors show a different effect of Rtn-1 axonal KD after injury.

      The second experiment is the Puro-PLA in Figure 6D. This experiment shows an average of 1.5 dots of signal per soma, which is a very low level of translation for this compartment where most of the synthesis should be taking place. In the axons, it is not clear how they calculate the axonal area. Again, the number of dots detected is very low and the physiological significance is questionable. A control with a known mRNA translated in axons would be important.

      Finally, as an important control, the authors should show the presence of Rtn-1 mRNA by FISH in their experimental system.

      R: We appreciate the critical points raised here, as they moved us to improve the quality of the findings. We analyzed cells/axons as statistical units to increase statistical power, given the subtle nature of these local changes. We agree with the reviewer that this approach may increase the risk of false positives. To address this point, i) we plotted the individual data points and colored them according to the experimental date (all dates showed a similar trend); ii) we indicated SD instead of SEM; iii) we analyzed our data using linear mixed-effects models, with experimental date included as a random effect. This approach preserves granularity and statistical power while avoiding pseudoreplication. To exclude artifactual changes, we now analyzed the intensity fold change of total fluorescence normalized to Scr. Our former quantifications were based on the corrected fluorescence intensity used to construct the plot profiles, which could have added some distortion to the measurements. These changes were applied throughout Figures 2 and 4 (pages 34 and 38, respectively). After these new analyses, the formerly presented results remain valid.
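The pseudoreplication concern can be illustrated with a small sketch. It is not the linear mixed-effects analysis described above; it is a simpler comparison between treating every cell as an independent unit and summarizing by experimental date. All values are hypothetical and were invented for illustration; the names `cells_by_date` and `summarize` are likewise illustrative.

```python
import math
import statistics

# HYPOTHETICAL fold-change values (normalized to Scr), grouped by experimental
# date. These numbers are made up and are not data from the manuscript.
cells_by_date = {
    "date_1": [0.82, 0.75, 0.90, 0.78],
    "date_2": [0.70, 0.66, 0.73, 0.71],
    "date_3": [0.88, 0.93, 0.85, 0.90],
}

def summarize(values):
    """Return (mean, sample SD, SEM) for a list of numbers."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(len(values))
    return mean, sd, sem

# Option A: pool all cells and treat each one as an independent unit (N = 12).
pooled = [v for values in cells_by_date.values() for v in values]
pooled_mean, pooled_sd, pooled_sem = summarize(pooled)

# Option B: collapse each date to its mean and use dates as units (N = 3).
date_means = [statistics.mean(values) for values in cells_by_date.values()]
rep_mean, rep_sd, rep_sem = summarize(date_means)

print(f"Pooled cells: mean={pooled_mean:.3f} SD={pooled_sd:.3f} SEM={pooled_sem:.3f}")
print(f"Per-date means: mean={rep_mean:.3f} SD={rep_sd:.3f} SEM={rep_sem:.3f}")
```

With these made-up numbers, the pooled SEM is smaller than the SEM computed across dates, which is the overstated precision the reviewer warns about. A mixed-effects model with date as a random effect sits between the two options: it keeps cell-level data while accounting for the shared date-level variation.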

      We thank the reviewer for raising concerns about the quantification of fluorescence intensity in cell bodies. We now specify in Materials and methods that fluorescence intensity analysis of distal axons (always isolated by the microfluidic chambers) and of cell bodies was performed using the wide-field configuration of the microscope. In all the cases, a single (epifluorescent) plane was analyzed to reflect the total fluorescence of a cell or axon. We did not exclude the nuclear region from the quantifications, as this would also remove cytoplasmic signal located above or below the nucleus.

      We also understand the concerns about the puro-PLA experiments. We agree with the reviewer that an average of 1.5 puncta per soma initially appeared low. We have identified at least three reasons for this. First, the signal derives from only a 15-minute puromycin pulse, which is a short labeling window. Second, our puro-PLA assay is particularly stringent, as ligation relied directly on puromycin- and Rtn-1C-labeled primary antibodies, without the additional spacing normally introduced by secondary antibodies. In standard PLA, the critical distance for amplification is ~30-40 nm, whereas in our assay this distance is even more restrictive. Third, in our initial analysis we applied an overly cautious threshold to define "true" amplification. We have now refined this threshold using a baseline defined by the absence of puromycin stimulation. With this improved criterion, we now quantify an average of ~5 puncta per soma and ~10 puncta per 1000 µm² of axonal area (Supplementary Figure 3D). As now described in the methods, we calculated the axonal area by binarizing the β3-tubulin staining and counted only the true amplification spots inside this region. Assuming a neuronal soma diameter of 15 µm (area ≈ 176.71 µm²), this yields ~0.028 puncta per µm² in somata. In comparison, axons display ~0.01 puncta per µm², approximately one-third of the soma value, which seems more reasonable. This is also compatible with most Rtn-1C synthesis occurring in the cell body.
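The density comparison above can be reproduced with a few lines. This is a sketch under the response's stated assumption of a 15 µm soma diameter, with the soma additionally treated here as a circular area in the imaging plane; the constant names are illustrative.

```python
import math

# Values quoted in the response.
SOMA_DIAMETER_UM = 15.0           # assumed neuronal soma diameter (µm)
PUNCTA_PER_SOMA = 5.0             # ~5 puncta per soma after re-thresholding
AXON_PUNCTA_PER_1000_UM2 = 10.0   # ~10 puncta per 1000 µm² of axonal area

# Assumption: the soma is approximated as a circle in the imaging plane.
soma_area_um2 = math.pi * (SOMA_DIAMETER_UM / 2) ** 2

# Convert both measurements to the same unit (puncta per µm²).
soma_density = PUNCTA_PER_SOMA / soma_area_um2
axon_density = AXON_PUNCTA_PER_1000_UM2 / 1000.0

# Ratio of axonal to somatic puncta density.
axon_to_soma_ratio = axon_density / soma_density

print(f"Soma area: {soma_area_um2:.2f} µm²")
print(f"Soma density: {soma_density:.3f} puncta/µm²")
print(f"Axon density: {axon_density:.3f} puncta/µm²")
print(f"Axon/soma ratio: {axon_to_soma_ratio:.2f}")
```

The ratio comes out slightly above one-third, in line with the "approximately one-third" statement. The result is sensitive to the assumed soma diameter, since the area scales with its square.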

      Unfortunately, we were not able to perform puro-PLA for other axonally synthesized proteins. Nevertheless, to further validate our puro-PLA signal, we tested the specificity of the Rtn-1C antibody used for this assay by WB, IF, and Rtn-1 KD (Supplementary Figure 3A-C). In addition, we performed axonal Rtn-1 KD in microfluidic chambers for twenty-four hours, which elicited a significant decrease in puro-PLA signal compared to Scr (Supplementary Figure 3D). Together, these results strongly indicate that the quantified signal reflects Rtn-1C synthesis. To show that Rtn-1 mRNA is present under these conditions, we have now included an RT-PCR performed on RNA isolated from the somato-dendritic and pure axonal fractions of 8 DIV microfluidic chambers (Supplementary Figure 3D). Note that the presence of this mRNA in axons has been supported by several studies, one of them using cortical neurons of similar DIV cultured in microfluidic chambers (Table I and Figure 1).

      2) The effects on tubulin following Rtn-1 downregulation in axons is potentially very interesting, but the authors should be careful because it could also mean that the axons are suffering. Can they also stain for other cytoskeletal markers?

      R: Regarding this concern, we are aware that in the former Figure 3 we mistakenly selected axonal fields that did not display healthy axons, which was not the dominant trend. That the axons are generally healthy is supported by the lack of fragmentation and by the functional responsiveness (microtubule dynamics) shown in Figures 4 and 5B, C, E. We have now replaced the previous axonal fields in Figure 3 with more representative (healthy) axons, devoid of varicosities and fragmentation (page 37).

      3) The results using SPTZ are very interesting and implicate spastin microtubule severing activity in the observed phenotype. In my opinion these experiments however do not prove that "axonal Rtn-1 is indeed promoting the severing of microtubules by spastin", but simply that the blocking spastin activity prevents the appearance of the microtubular phenotype (which appears still with a mysterious mechanism). What happens if they try to stabilize the cytoskeleton by another mean (with taxol for example?). The authors should rephrase this conclusion.

      R: We completely agree with the reviewer's assessment. We now explicitly indicate in the main text that, at this point in the manuscript, this is still a correlative phenomenon that suggests an interplay with Spastin activity: "...Together, these results suggest that locally synthesized Rtn-1 normally acts to suppress the outgrowth of injured axons, a process that could involve the microtubule-severing activity of Spastin." (lines 321-323). Later in the article, with the improved Figure 7, we further propose that these findings may reflect a causal relationship, although this mechanism has not yet been directly demonstrated in axons.

      4) The last experiment (Figure 7) that aims to connect Rtn-1 and spastin function is very artificial, since it is based on overexpression. Why should spastin M87 interact with an ER morphogen? Endogenously it is conceivable that spastin M1 which localizes to the ER would interact with Rtn-1. Moreover, this experiment needs further controls and quantifications. First, it is quite obvious from panel 7C that there is crossover of signal in the two fluorescence channels (see GFP and spastin). Controls need to be shown, where only one of the two fluorescent proteins is expressed, and the specificity of the laser is tested. This experiment is based on only 1 cell shown where co-localisation is detected based on a line that is placed in a specific area of the cell. The effects on the microtubular network needs quantification.

      R: We have now improved Figure 7 and added the requested controls to rule out crosstalk, as indicated in Supplementary Figure 5 and in the main text. We agree that under normal conditions GFP should display a diffuse cytoplasmic distribution. However, in our experimental setup, we observed punctate GFP signals only in the context of co-expression with Spastin-RFP. This is consistent with prior reports showing that soluble GFP can occasionally be sequestered into late endosomal structures (Sahu et al., 2011), which are also known to harbor the M87 Spastin isoform (Allison et al., 2013; Allison et al., 2019). To exclude the possibility of unspecific fluorescence crosstalk, we independently acquired each fluorophore channel and confirmed that GFP puncta were genuine and not due to bleed-through (Supplementary Figure 5). Further, cells expressing only GFP or only Spastin-RFP did not show overlapping puncta (arrowheads), and the co-expression of GFP with Rtn-1A-RFP did not produce any apparent overlap, indicating that the punctate pattern of GFP is specifically associated with Spastin co-expression. Thus, we consider that the observed GFP colocalization with Spastin potentially reflects a true phenomenon and not an artifact of imaging or fluorophore bleed-through.

      We thank the reviewer for these observations and apologize for the confusion in the layout of the former Figure 7 and for the lack of a better description. As the reviewer indicates, one interesting aspect of the M87 isoform is that it lacks the ER morphogen domain (so it is, in principle, soluble or cytoplasmic). However, it also harbors endosome- and microtubule-binding domains which, according to previous literature (now included in the main text), may give it a punctate rather than a homogeneous pattern. Also, M87 is the most abundant isoform in the nervous system, particularly in early development. This is why we selected this isoform to test our model. To clarify this point, we based our colocalization analysis on different cells and experimental dates and analyzed all the z-stacks for each cell (see new Figure 7B and methods); the intensity plots (now removed) were only for graphical purposes. Similarly, we had already quantified the total tubulin intensity in COS cells based on many cells from different dates and included the sum projections of all the z-stacks from these cells (see new Figure 7C). Thus, we removed the intensity profiles, as they were clearly misleading (see new Figure 7).

      We agree that over-expressing constructs may force interactions or co-distribution of proteins. However, if the observed results were mainly due to over-expression, we should see a similar trend with isoform A, as both constructs are under the control of the same strong promoter (CMV) and harbor the same ER morphogen domain (RHD). Nevertheless, the distribution of M87 tightly mirrors that of Rtn-1C, which is not the case for Rtn-1A. As a purely theoretical prediction, our molecular modeling suggests that Rtn-1C may associate with Spastin through its microtubule-binding domain (Figure 7E). This would imply that Spastin "decorates" ER tubules rather than residing in the same ER membranous structure. This discrete pattern of Spastin is more coherent with the distribution of both proteins that is now more clearly observed in distal axons by STED super-resolution (new Figure 5A). So, although somewhat unexpected, these results suggest a novel interaction mechanism between these two proteins that deserves further validation.

      5) What is exactly the model proposed? The title implies that axonal synthesis of Rtn-1 is important during injury, but the data in the paper rather suggest that upon injury the majority of Rtn-1 is not locally synthesized. If the levels of Rtn-1 do not change, why the effect on the microtubules should be specific? Why would a siRNA against Rtn-1 in axons not affect the levels of Rtn-1, but those of tubulin? The authors should be careful, and test other control siRNAs, and Rtn-1 siRNAs, since it is well known even in more simple cellular systems that the toxicity of individual siRNAs can vary greatly.

      We consider the possibility that there is no axonal Rtn-1 synthesis after injury to be plausible and relevant. Unfortunately, we could not perform a puro-PLA experiment after injury, which would have provided a more definitive answer. However, we are now more confident that axonal KD regulates Rtn-1 synthesis before injury, as supported by Supplementary Figure 3D, which shows a significant decrease in puro-PLA signal (indicative of Rtn-1C synthesis) 24 hours after axonal KD. Thus, based on some similar phenotypes before and after injury, we consider our results still compatible with axonal Rtn-1 synthesis being downregulated, but not fully absent (the mRNA is still detected, as described by Taylor 2009). Specifically, axonal Rtn-1 KD decreased β3-tubulin levels before and after injury according to Figure 5B and the improved statistical analysis performed on Figure 2E. Similarly, axonal Rtn-1 KD significantly increases microtubule growth rate before and after injury according to the current statistical comparisons (Figure 5E). In addition, if the β3-tubulin decrease were merely due to unspecific siRNA targeting, it is unlikely that SPTZ treatment would restore β3-tubulin only in the context of axonal Rtn-1 KD (Figure 5B). Although on a different track, the mechanistic relationship between Rtn-1C and Spastin suggested in Figure 7 makes it more plausible that a similar phenomenon regarding the control of tubulin levels could be occurring locally in axons. We have now included these considerations in the discussion (lines 535-543).

      To rule out off-target effects, we have now validated a third siRNA sequence (siRNA 3) specifically designed against Rtn-1 and showed that it selectively downregulates Rtn-1C but not β3-tubulin in cultured cortical neurons. Then, following the same experimental design as in Figure 3, we performed axonal Rtn-1 KD after injury and observed that siRNA 3 also significantly increases the outgrowth of injured axons (Supplementary Figure 2). This suggests that at least this phenotype is not the product of an off-target effect. Thus, the pharmacological rescue of β3-tubulin levels by SPTZ (Figure 5B) and the Rtn-1C/Spastin co-distribution in heterologous cells, which correlates with preserved microtubules (improved Figure 7), provide converging evidence that an Rtn-1C-Spastin interplay may underlie the observed phenotypes in axons.

      Minor comments:

      In Figure 5A, it would be helpful to indicate the border of the axon. The figure is not really convincing.

      Following your comments and those of the other reviewers, we analyzed a new set of experiments with STED images of non-injured and injured axons. To eliminate the risk of artifactual descriptions, we avoided deconvolution and worked directly with raw STED images (Figure 5A). Under these conditions, neither the distribution nor the intensity of Rtn-1C and Spastin in distal axons is modified by injury (Supplementary Figure 4). Nevertheless, the data still support that both proteins are restricted to similar subcellular domains before and after injury.

      Reviewer #3 (Significance (Required)):

      The manuscript uses complex methods to address an interesting cell biological question of relevance to understand axonal growth regulation upon injury. A limitation of the study is the statistical analysis, which triggers some doubts about the reproducibility of the data. Further experiments and the addition of controls would be important to support the claims of the authors.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This manuscript investigates the relationship between the endoplasmic reticulum morphogen reticulon-1 (Rtn-1) and the microtubule severing protein spastin in axons after injury. The main message and conclusion of the paper is that local axonal synthesis of Rtn-1 plays a role in regulating the microtubule severing activity of spastin by interacting with spastin and inhibiting its activity. This mechanism would be important after injury by regulating axonal growth.

      The conclusions of the paper are based on the following claims:

      1. Rtn-1 is synthesized locally in axons.
      2. Specific downregulation in Rtn-1 in axons using microfluidic chambers affects microtubules abundance (measured by beta-3 tubulin) and promotes axon growth after injury.
      3. Inhibition of spastin MT-severing activity with a specific drug rescues the growth effect induced by axonal downregulation of Rtn-1.
      4. Rtn-1c interacts with spastin-M87 to limit its MT-severing activity in a cellular system upon overexpression.

      Major comments:

      1. Evidence that Rtn-1 is synthesized in axons comes from two experiments. Initially, the authors show that Rtn-1 siRNA transfection in the axonal compartment of microfluidic chambers reduces Rtn-1 levels in axons, suggesting that there is some local synthesis. Although this method is very attractive, I am concerned about the statistical analysis. The graphs show bars rather than individual data points from the average of a large number of neurons (about 300). The plots also show the SEM instead of the SD, thus covering all the variability that is inherent in this type of experiment. The statistics are probably not performed on the 3 biological replicates, but consider the individual neurons as N. This is obviously not correct, since neurons in an experiment may all be affected by the same technical problem and are not independent replicates. For this reason, I am a bit skeptical about this quantification. Another problem is that the quantification of the fluorescence intensity of the sample does not take the nuclei into account. Are the nuclei removed for analysis? Are the images single planes? Addressing the quantification issues is crucial also for data in Figure 4, where the authors show a different effect of Rtn-1 axonal KD after injury. The second experiment is the Puro-PLA in Figure 6D. This experiment shows an average of 1.5 dots of signal per soma, which is a very low level of translation for this compartment where most of the synthesis should be taking place. In the axons, it is not clear how they calculate the axonal area. Again, the number of dots detected is very low and the physiological significance is questionable. A control with a known mRNA translated in axons would be important. Finally, as an important control, the authors should show the presence of Rtn-1 mRNA by FISH in their experimental system.
      2. The effect on tubulin following Rtn-1 downregulation in axons is potentially very interesting, but the authors should be careful because it could also mean that the axons are suffering. Can they also stain for other cytoskeletal markers?
      3. The results using SPTZ are very interesting and implicate spastin microtubule severing activity in the observed phenotype. In my opinion, however, these experiments do not prove that "axonal Rtn-1 is indeed promoting the severing of microtubules by spastin", but simply that blocking spastin activity prevents the appearance of the microtubular phenotype (which still arises through an unknown mechanism). What happens if they try to stabilize the cytoskeleton by another means (with taxol, for example)? The authors should rephrase this conclusion.
      4. The last experiment (Figure 7) that aims to connect Rtn-1 and spastin function is very artificial, since it is based on overexpression. Why should spastin M87 interact with an ER morphogen? Endogenously, it is conceivable that spastin M1, which localizes to the ER, would interact with Rtn-1. Moreover, this experiment needs further controls and quantifications. First, it is quite obvious from panel 7C that there is crossover of signal in the two fluorescence channels (see GFP and spastin). Controls need to be shown, where only one of the two fluorescent proteins is expressed and the specificity of the laser is tested. This experiment is based on only one cell shown, in which co-localisation is detected along a line placed in a specific area of the cell. The effects on the microtubular network need quantification.
      5. What exactly is the model proposed? The title implies that axonal synthesis of Rtn-1 is important during injury, but the data in the paper rather suggest that upon injury the majority of Rtn-1 is not locally synthesized. If the levels of Rtn-1 do not change, why should the effect on the microtubules be specific? Why would a siRNA against Rtn-1 in axons not affect the levels of Rtn-1, but those of tubulin? The authors should be careful, and test other control siRNAs, and Rtn-1 siRNAs, since it is well known even in simpler cellular systems that the toxicity of individual siRNAs can vary greatly.
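
      To make the statistical point in major comment 1 concrete, here is a minimal Python sketch; it is not the authors' analysis. It assumes per-neuron intensities are grouped by biological replicate, and the function name `summarize` and any values passed to it are hypothetical. It contrasts the SEM obtained when every neuron is treated as an independent N with the spread of the per-replicate means.

```python
import math
import statistics


def summarize(replicates):
    """Summarize per-neuron measurements grouped by biological replicate.

    `replicates` is a list of lists: one inner list of per-neuron values
    for each independent experiment (hypothetical input layout).
    """
    # Pooling treats every neuron as an independent observation.
    pooled = [value for rep in replicates for value in rep]
    # One mean per biological replicate is the independent unit.
    rep_means = [statistics.mean(rep) for rep in replicates]

    return {
        "n_cells": len(pooled),
        "n_replicates": len(rep_means),
        # SEM with N = number of neurons (shrinks as more cells are imaged).
        "pooled_sem": statistics.stdev(pooled) / math.sqrt(len(pooled)),
        "replicate_means": rep_means,
        # SD and SEM with N = number of biological replicates.
        "replicate_sd": statistics.stdev(rep_means),
        "replicate_sem": statistics.stdev(rep_means) / math.sqrt(len(rep_means)),
    }
```

      When replicates differ from one another, the pooled SEM can be far smaller than the replicate-level SD, which is why plotting the individual replicate means alongside the summary would be more informative.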

      Minor comments:

      In Figure 5A, it would be helpful to indicate the border of the axon. The figure is not really convincing.

      Significance

      The manuscript uses complex methods to address an interesting cell biological question of relevance to understand axonal growth regulation upon injury. A limitation of the study is the statistical analysis, which triggers some doubts about the reproducibility of the data. Further experiments and the addition of controls would be important to support the claims of the authors.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Axonal mRNA localization and localized translation support many neuronal functions and are an important determinant of the regenerative potential of axons after injury. How this works mechanistically remains unclear. The authors present a well-performed and technically challenging study in which they identify RTN-1 as a regulator of axonal outgrowth after injury. They provide evidence using experiments in microfluidic chambers that RTN1 is locally synthesized in axons. Interestingly, they identify a (local) interplay between RTN1 and Spastin which affects microtubules and thereby regulates the outgrowth of cortical axons after injury. This study provides an interesting new link between a locally synthesized protein (RTN1) and a microtubule-regulating protein, Spastin, that is changed upon axon injury. This provides an advance in our understanding of axon regeneration after injury and provides the basis for new studies that can further investigate this interplay. Although the study is interesting, I have several concerns that should be clarified and are needed to substantiate the findings and model presented in this study.

      Major concerns:

      1. In figure 1, the authors provide an analysis of overlapping axonal mRNAs. There are more axonal transcriptome studies and a recent study by von Kugelgen and Chekulaeva (2020; doi: 10.1002/wrna.1590) already performed such an analysis, which included more studies. It would be good to mention this. It can be perceived that studies were now chosen to get the outcome that Rtn-1 is present in all studies. For example, von Kugelgen finds mRNA coding for RTN3, another ER structural protein, as present in 16 out of 20 studies analyzed. That said, the authors present more reasons to look at Rtn-1, so the selection to continue with this protein remains valid but can be written up differently so as not to present it as the 'sole' ER-shaping protein consistently present in axonal transcriptomes.
      2. The description of methods is currently insufficient and incomplete and does not allow for reproducibility of this study. For example, different Rtn-1 antibodies seem to be used in this study. Is the same antibody used for staining and WB? There is no listing of any of the antibodies used in the study and which one is used for which technique/experiment. This should be clarified and should be easy to do in the methods section (antibody name, origin/company, dilution used) to enhance reproducibility of this study. This is not limited to primary antibodies: information on secondary antibodies, including what was used for STED, is completely missing.
      3. The timelines of the KD experiments in Figures 2 and 3 are unclear. For the Western blot, KD is performed at DIV7 and collected 48 hours later. However, this is not specified for the stainings done in Figure 2C-E. Is this also at DIV7 and then for 48 hours? In figure 3 the siRNA is added at DIV8 (together with axotomy) and outgrowth is measured 24 hours later. Is 24 hours sufficient to achieve knockdown? Is this also what was done for stainings? Later on in Figure 5B, 48 hours of KD is again used. It is unclear what the rationale for these differing timepoints is. Why was this chosen? Is the timeline also the reason for the difference in segment lengths chosen? In Figure 3, there is a significant effect on outgrowth in the KD in the 'mid-range' which is not present in Figure 5.
      4. Could the authors provide a rescue condition for their siRNA (using a siRNA-resistant construct) to show that their siRNA is specific for RTN1? They nicely show the efficiency of the siRNA but not its specificity. This is crucial because if not specific, this will affect a large part of their study. They already have RTN1A and RTN1C constructs available. Such a rescue experiment should ideally also be performed for one or more of their phenotypic experiments, such as the one presented in Figure 3A or 5 to show that the phenotype is really RTN1 dependent. If done by re-expressing either RTN1A or RTN1C, this could provide insightful information on the relevant isoforms.
      5. I find the data presented in Figure 4A/B confusing. Axonal RTN-1 KD does not reduce axonal RTN1 levels but somatic KD does. I understand that this implies most protein comes from the soma and the authors indeed present an explanation that increased somatic RTN1 occurs after axonal KD as a compensation mechanism. However, this can also be interpreted to mean that there is no axonal synthesis of RTN1 after injury and that axonal KD has indirect or even aspecific effects. Their model depends on this difference. Their data in Figure 6 could provide supporting evidence if it shows RTN1 puro-PLA after injury. Along these same lines, in Figure 6, they nicely include a compartment control for puro-PLA. It therefore seems doable to include a somatic puromycin control for their axonal puro-PLA, to exclude any diffusion/transport of the newly synthesized peptides. This is especially important in light of two recent papers reporting on this possible phenomenon, although these studies were not performed in neurons.
      6. In Figure 5A the authors find an increased co-localization (RTN1/Spastin) after axotomy. From their images, it seems that the amount of Spastin is hugely increased, which would by default increase the chance of (random) colocalization of RTN1 on Spastin. Could the authors comment on this?
      7. In figure 5E and 5F, the condition of scr + SPTZ is omitted. What is the reason for this? The explanation of results in these figures is confusing. The authors report a 'clear trend' in increase in comet track length and lifetime upon addition of SPTZ to axonal RTN-1 KD. This is however not significant. The comparisons that are made afterwards are confusing (e.g. increase in comet lifetime of SPTZ in non-injured axons with RTN1 KD compared to Scr+DMSO and KD + DMSO in injured axons). Their conclusion is that axonal RTN-1 synthesis in injured axons (see my concern in the points above on this) governs microtubule growth rate beyond Spastin activity, yet blocking Spastin activity still completely blocks the effect of KD on outgrowth.

      Other/minor concerns:

      • The gene ontology analysis in Figure 1A contains the category 'Endoplasmic reticulum'. This category contains mainly ribosomal proteins. Although in a gene ontology analysis these proteins will be included in this category, it is misleading in this respect since they are just as likely to be coming from cytoplasmic ribosomes. Although it cannot be excluded that these are ER-bound ribosomes, not least because a recent study (Koppers et al., 2024, doi: 10.1016/j.devcel.2024.05.005) found ribosomes attached to the ER in axons, I believe the category should be adapted or at the least clarified in the text.
      • Is RTN-1C isoform still an ER-shaping protein or rather an ER protein with alternative functions? The final sentence in the abstract makes a statement that a locally synthesized ER-shaping protein lessens microtubule dynamics. Could the authors provide a clearer description and discussion of the evidence in literature for this? RTN1C has been suggested to perform alternative functions in which case the statement that the local synthesis of an ER-shaping protein is important for axonal outgrowth should be adapted.
      • Is there a difference in RTN1 distribution or levels pre- and post-axotomy?
      • Line 100/101 states 'the interactome of the axonal ER provides...'. To my knowledge there has been no study looking at the interactome of the axonal ER specifically. Surely axonal ER proteins are known but there is a difference.
      • Typo line 160 'localla'
      • In Figure S1 B, please add the DIVs to make it more clear what each graph corresponds to. The legend of S1B states different distances from the cell body but the graph shows distances from the tip.
      • Figure 2C, why does B3 tubulin decrease in soma, aspecific effect of siRNA?
      • What is the rationale on the opposite effect found in outgrowth in Figure 3?
      • Missing word 'we' on line 194
      • Typo line 629 'witmn h', please proofread the entire manuscript carefully.
      • Could the authors comment on why, in Figure 7B/C, GFP only is colocalizing with Spastin-RFP? In general, GFP should be diffusive and not display punctate colocalization with Spastin.

      Significance

      Axonal mRNA localization and localized translation support many neuronal functions and are an important determinant of the regenerative potential of axons after injury. How this works mechanistically remains unclear. The authors present a well-performed and technically challenging study in which they identify RTN-1 as a regulator of axonal outgrowth after injury. They provide evidence using experiments in microfluidic chambers that RTN1 is locally synthesized in axons. Interestingly, they identify a (local) interplay between RTN1 and Spastin which affects microtubules and thereby regulates the outgrowth of cortical axons after injury. This study provides an interesting new link between a locally synthesized protein (RTN1) and a microtubule-regulating protein, Spastin, that is changed upon axon injury. This provides an advance in our understanding of axon regeneration after injury and provides the basis for new studies that can further investigate this interplay. Although the study is interesting, I have several concerns that should be clarified and are needed to substantiate the findings and model presented in this study.

      The audience for this study will be mainly basic research in the fields of both axonal protein synthesis and axon regeneration. My expertise is in the field of mRNA localization and local protein synthesis.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, the authors focus on the role of Reticulon-1C in concert with Spastin in response to axonal injury. In data mining, they find axonal mRNAs encoding for ER-associated proteins including Rtn-1. They establish a knockdown targeting both Rtn-1 isoforms Rtn-1A and Rtn-1C. They observe decreased beta-3-Tubulin levels in the soma while axonal protein levels are unchanged. In microfluidic devices, they characterise the effect of a compartment-specific Rtn-1 KD on axonal outgrowth in the axonal compartment. The authors quantify axonal outgrowth, seeing increased outgrowth in an axonal compartment-specific Rtn-1 KD, while the effect seems to be reversed when applying the KD construct in the somatic compartment. When focussing on the axonal growth cone, they find the Rtn-1 KD shows differences in several morphological features of the growth cone. They find an increase in Tubulin levels in an axonal compartment-specific, but a decrease in a somatic compartment-specific Rtn-1 KD. Colocalisation of Rtn-1C and Spastin is shown to be monolaterally increased following axotomy. Combining axotomy with the Rtn-1 KD shows increases in dynamic microtubule growth rates and track lengths. In another model system, neuron balls, they show Rtn1-C, but not Rtn1-A to be present in the axon. In a puro-PL assay they also show it can be synthesised in the axonal compartment. To investigate the mechanism enabling the cooperation between Spastin and Rtn-1C, they move to a cell line model in which they see a correlating distribution between Spastin and Rtn-1C but not Rtn-1A. Finally, they use in silico modelling to speculate on binding between Spastin domains and Rtn-1 isoforms.

      Major comment:

      The rationale behind the work is convincing; however, some interpretations are presented as more robust than the data allow. Most notably, while the interaction between Rtn-1 and Spastin has been shown prior to this study, it is only presented here through in silico analysis. In figure 5, an increase in the growth rate of dynamic microtubules is observed in either a Rtn-1C KD or by using a Spastin-inhibitor. Due to a described increase in colocalisation between Rtn-1C and Spastin (5A), the increase in growth rate is displayed as caused by Rtn-1 promoting Spastin's severing ability. This result might however be correlative. Furthermore, in the injured samples, Spastin levels seemingly increase (in the representative images) and it is thus not surprising that the level of Rtn-1C colocalising with Spastin increases as well. This might not be indicative of a cooperation, and further experimental evidence is required.

      Other comments:

      • Generally, graphs would benefit from individual values plotted as well as the summary. Font sizes and, more rarely, font types are sometimes inconsistent. Protein names should be written consistently (capitalised or not).
      • Table 1 and figure 1 present data collected from a vast number of resources. It should be highlighted that the datasets from which data were obtained include many different models, different DIVs and neuronal cell types. Figure 1B may benefit from a different colour scheme. "Ex-vivo" should be "Ex vivo". For "ER mRNAs are a relevant category" it is not described what "relevant" would mean in this context. The title might remove this small part or describe it in the text. It should be described how it is decided that mRNAs are "common".
      • Figure 2: add description to y-axis to describe what fold change is displayed, applies to multiple figures. Will improve readability of the figures. In 2C, the ROI showing neuronal somata should be increased to show part of the axon and not cut off the soma.
      • Figure 3: Three out of four axonal compartments seem to be comprised of dying or damaged axons, especially in the axonal KD scrambled image. It should be ensured that neuronal cultures are healthy. Typo in "intersections". The schematic of 3B is a great addition to explain the graphs above. Perhaps it could be a bit refined as it is currently hard to see whether this is a neuron or a growth cone without context. Maybe show where the axon connects to the depicted growth cones and change the third icon which looks like it was crossed out. Small formatting issues: remove the additional space before "Figure 3." and add one after "Bar".
      • Figure 4: If I am not misunderstanding what is depicted, in 4A and B, different lookup tables are used to depict the same signal. Only one image of each pair is necessary. Do the axons have more tiny branches in the Rtn-1 KD condition in 4A? It is unclear why Rtn-1 levels are increased in the Rtn-1 KD (4C); please clarify.
      • Figure 5: It may be easier to understand what "axotomy" samples are if just referred to as "injured" as later in the same figure. The procedure could also very briefly be explained in the results. 5C should depict AUC in µm², not µm. In 5D, Spastin is barely visible; brightness and contrast should be adjusted to enhance visibility.
      • Figure 6: It should be made clear why it is necessary to switch to another model system just for 6A, please indicate this in the text. PCR bands seem very pixelated, check the quality. It is unclear why soma genes/proteins were only tested with either PCR or WB while others were tested with both. Rtn-1C and Rtn1-A should be presented in the same order in the PCR and WB panel. Correct "Rtn1-1A" typo. In 6D, 1.5 dots per soma seems like a low number. When normalised to the area the soma vs the axon occupies, the compartmentalisation does not work? Maybe it makes sense to refine the analysis or apply puromycin in the somatic compartment and analyse the axonal compartment as comparison?
      • Figure 7: 7A shows two images depicting the same information that may not be needed. Can probably be removed. In 7B there is no negative (or any) correlation between Spastin levels and Tubulin, however later it is mentioned that Rtn-1C transports Spastin thus causing a decrease in Tubulin at certain locations? It is unclear if Spastin levels vary intensely between different samples. Mean intensity of the somatic area may be beneficial to rule this out. 7B Tubulin on the right top panel seems to have a decrease in Tubulin levels which is not visible due to the Y axis of Tubulin being set to a different range than the middle and lower panel. The average of line scans from multiple cells may be helpful to determine whether there is indeed no colocalization between Rtn-1A and Spastin. The provided representative images seem to show similar degrees of colocalization between Spastin and Rtn-1A/C.

      Results:

      • It would be helpful to reiterate the hypothesis at the start to ease the reading flow.
      • There seems to be minor redundancy in lines 132-138.
      • There are several spelling errors; proofreading is recommended. For example, line 136 should be "promotes", 160 "localla", 192 should be "the actin cytoskeleton", 194 should be "we first examined", 195 should be "Different", 223 "using", 259 "axons". ...
      • 154-155: Unclear, why the lower MW Rtn-1C was seen as more important.
      • 167 results of 2E not stated before interpreting them.
      • 181 would suggest "outline" instead of "perimeter".
      • 183-184 "longest shortest path" is a confusing term.
      • figure 4B should be referenced earlier in the sentence.
      • 243-244 may be correlation. Rtn-1 and Spastin do not necessarily interact so that this result is achieved.
      • 246: In figure 1 the KD seemed to have an effect on both Rtn-1 isoforms, why not here anymore? 259 "axons". 284 "counteract" instead of "suppress"?
      • 485: rephrase as the interaction between Rtn-1C with Spastin has not been shown directly in these experiments.

      Methods: 535 "in PBS". 543 citation error. 689-699 is it necessary to add a gaussian blur?

      References: Mannan, A U et al. appears twice in the citation list (36 and 44).

      Significance

      Overall, this manuscript describes novel findings which will be interesting to the neuronal cell biology community and scientists working in the field of neuronal injury and regeneration. It is well structured, and the data are mostly well presented, but sometimes conclusions are over-interpreted. Several points need to be addressed in a more convincing way.

    1. Reviewer #2 (Public review):

      Chalamalasetty et al. investigate the regulatory circuit of signaling molecules and transcription factors that drive the fate of neuromesodermal competent progenitors (NMCs). NMCs contribute to Sox2-positive spinal cord and Tbxt/Bra-expressing somitic mesoderm, and this choice is governed by the interplay between Wnt3a and Fgf signaling. The authors discovered that the transcription factors SP5 and SP8 participate in this process. Mouse genetics, in vivo development, and transcription factor profiling point to a model where SP5 and SP8 directly regulate Wnt3a expression to foster Tbxt-marked mesoderm formation at the expense of Sox2-marked neural ectoderm. Mechanistically, SP5/8 bind to an enhancer which the authors characterize: its activity depends on the presence of SP5, CDX2, TCF7, and TBXT binding sites, and it is activated only in primitive streak cells at E7.5, in NMP, and in caudal and somitic mesoderm, underscoring the tissue- and stage-specific nature of this Wnt3a enhancer.

      Moreover, the authors find that SP5/8 likely regulate the TCF7 association with the chromatin and compete for its binding to the TLE repressor.

      The study is extensive, compelling, and well written. The combination of in vivo evidence with single-cell transcriptomics, transcription factor profiling, and in vitro regulatory element characterization is notable and builds a convincing picture of the action of SP5/SP8.

      Here, I provide a series of comments and questions that, if addressed and clarified, could, in my opinion, improve the study.

      (1) While Sp5 and Sp8 are both present in NMCs, their expression does not fully overlap. Sp5 is also detected in caudal and presomitic mesoderm, notochord and gut, while Sp8 overlaps with Sox2 in neural progenitors of the spinal cord and brain (Fig. 1D). Accordingly, Sp8 expression is also activated by the neural-promoting RA+Fgf. It is not easy for me to reconcile this non-fully overlapping expression pattern - and in particular the overlap of Sp8 and Sox2 - with the presumed redundancy (or similarity of function) described later. Sp5/8 dko NMCs show reduced Tbxt and expanded Sox2, indicating that SP8 also represses Sox2 or neural fate, an observation confirmed by Sp8 overexpression (Figure 4c). What is the explanation for this, and is the function of SP8 in Sox2-positive neural progenitors different from its Wnt3a-sustaining role in NMCs? Or what am I missing?

      (2) I suggest that the authors show relevant ChIP-seq peaks in Figure 3 to lend credibility to the complicated overlapping Venn diagrams. I consider visual inspection of peak tracks as primary quality control of this type of experiment. A good choice could be the cis-regulatory elements at Sp5, Sp8, Tbxt, Cdx1, 2, 4 bound by TBXT and either CDX2, SP5, or SP8 (now referring to the Venn diagrams and the annotated peak table). On ChIP-seq visualization, in reference to Figures 5 and 7, I also suggest that the authors show the tracks of a negative control (IgG, non-related antibody, or better anti-flag in Sp5/8 dko). While I do not doubt the validity of these experiments, there are peaks in these figures bound by all factors tested that could be suspicious (even though, admittedly, they look like genuinely good TF peaks). A negative track would clearly show beyond any doubt that these are not suspect regions of positive unspecific signal caused by open chromatin, excessive cross-linking, or antibody cross-reaction.

      (3) SP5 is found here to be a direct inducer of Wnt3a expression and, accordingly, a positive regulator of Tbxt, mesoderm, and caudal development. I find this in partial contradiction with a finding by the Willert group (PMID: 29044119). They show that "genes with an associated SP5 peak, such as SP5 itself, AXIN2, AMOTL2, GPR37, GSC, MIXL1, NODAL, and T, show significant upregulation in expression upon Wnt3a treatment in SP5 mutant cells". There, essentially, SP5 inhibits Wnt target genes. While the authors are aware of this and cite Huggins et al., I find that this deserves a better discussion addressing how opposite functions could be sustained in different contexts, if these really are different cellular contexts in the first place, or if this could result from different methodologies.

      (4) The gastruloid experiment is nice, but I wonder whether there is any marker that the authors can use to show that other features of the gastruloids respond accordingly. For example, is the Sox2 expression domain expanded? And is there any unaffected marker to emphasize the specificity of the decreased Tbxt and Cdx2?

      (5) SP5/8 seems to enhance the TCF7 occupancy at WRE. And then, SP5/8 appears to counteract the presence of TLE repressor associated with TCF7. While these two mechanisms are interesting, they are not necessarily interconnected. According to the still-established view, TCF7 should be associated with WRE even in the absence of the Wnt signal, when TLEs are also present on the locus. One could expect that SP5 competes with TLE, to decrease its presence on TCF7-bound loci, leaving the abundance of TCF7 binding unchanged. Yet, the authors also observe that the TCF7 association changes. What is the mechanism implied? Do they perhaps consider a TCF7L1 > TCF7 switch, and if so, what evidence exists for this?

      (6) Along the same line as above, I wonder whether beta-catenin binding is also enhanced at these sites? Any TCF/LEF would require beta-catenin for gene upregulation.

      (7) The authors write that "Small Tle peaks were identified at these WREs in WT cells, demonstrating that both repressive Tle and activating Tcf7 could be detected at active genes". However, ChIP-seq is a population assay, and it is possible - more plausible, in fact - that cells displaying TLE binding are not expressing the target genes.

    1. The article “Use of diverse data sources to control which topics emerge in a science map” aims to analyze the effects of different data sources on topic clustering bias in science maps. For this purpose, the clustering effectiveness of different topic categories is analyzed based on different traditional and non-traditional data sources.

      (1) contribution to existing literature

      The present research is well embedded in the existing body of literature and builds on the study “Which topics are best represented by science maps? An analysis of clustering effectiveness for citation and text similarity networks” by Bascur, Verberne, van Eck, and Waltman (2024). That study explored the extent to which science maps can successfully cluster documents that address the same topic - a concept referred to as clustering effectiveness. This metric serves as an indicator of the thematic precision of clustering approaches. Bascur et al. (2024) found that clustering effectiveness varies depending on the topic domain: documents related to certain topics, such as diseases, were more accurately clustered than those related to others, such as geography. Building on these findings, the present study investigates whether the clustering effectiveness for documents on the same topic is influenced by the choice of data source, and whether this effectiveness can be systematically adjusted or optimized through the selection of that source.

      (2) major strengths and weaknesses

      The article’s ideas and arguments are presented with clarity and precision. Its structure follows a classic and well-established format - introduction, background, methods, results, discussion, and conclusion - which makes it easy to follow. As a reader, I never lost the thread; the narrative remains coherent and accessible throughout. The current state of research is conveyed in a thorough, well-reasoned, and nuanced manner. Particularly noteworthy is the detailed introduction to the topics of science maps based on diverse sources and comparing clustering solutions of different networks. This contextualization is both comprehensive and essential for understanding the research that follows.

      The document selection is highly extensive (4,142,511 documents) and well-justified. The rationale for which documents are included in the study is clearly and convincingly presented. All selection criteria are explained in detail in Section 3.1.

      The introduction clearly explains the rationale for using non-traditional data sources alongside traditional data sources, and the justification is both logical and easy to follow. The external data sources are introduced and described in Section 3.2. The procedures for building the different networks (Sections 3.3 and 3.4), as well as the clustering approaches (Section 3.5), are also thoroughly explained. The topics and topic categories analyzed in the study are presented and justified in detail in Section 3.6. To evaluate how well different topics are represented within the clustering networks, the study employs the concept of clustering effectiveness. The relevant calculations are described in Section 3.7.

      The article presents its complex results in a well-structured and sensible tabular format. Figure 2 provides an example to illustrate how the results are displayed. Table 3 reports all detailed results, while Table 4 offers a summary, and Table 5 draws conclusions on which network performed best for each topic. The tables and their captions are extensive and may seem overwhelming at first glance. However, the article makes it clear that this level of detail is both intentional and necessary. The thorough descriptions guide the reader through the results and enhance comprehension. Rather than being a weakness, the comprehensive presentation reflects the authors’ careful and rigorous approach.

      Regarding additional strengths of the article, I would like to highlight and support those identified by the authors themselves. This study represents a clear advancement over the 2024 publication. By focusing on a single metric—purity, rather than also including inverse cluster number—the evaluation and interpretation of results have been significantly simplified, and comparability has improved. Whereas the earlier study only allowed comparisons between cluster solutions based on identical document sets and similar cluster sizes, the current study enables comparisons across different networks, even when they involve varying documents and cluster structures. A notable innovation in this article is the introduction of purity profiles, which effectively illustrate how clearly topic clusters would be perceived by users navigating the science map.

      In addition to highlighting the strengths of their work, the authors also acknowledge three key limitations. These include the absence of a specified minimum cluster size, the combination of bipartite and non-bipartite networks, and the potential inaccessibility of certain data sources for other researchers (e.g., due to paywalls such as those associated with the Twitter API). Each of these limitations is clearly presented and discussed in the article. The authors provide thoughtful reasoning on the impact of these constraints and explain how they have addressed them within the scope of their study.

      (3) suggestions for improvements

      I have no suggestions for improving the article.

      (4) data and code availability / research ethics / MetaROR policies

      The research data is available on Zenodo in accordance with the principle "as open as possible, as closed as necessary". Due to legal restrictions, the raw data used in the experiments cannot be shared. However, the code used to conduct the experiments and generate the results is provided, along with a summary of the data utilized. This ensures transparency and allows others to understand the methodology and replicate the results, even in the absence of the original raw data.

    2. In this article, the authors present a study using different networks from various data sources to measure differences in gathering scholarly document topics and to show which networks provide the best information to represent the scientific topics considered appropriately. The work is built on a previous contribution and analyses networks obtained from six sources: scholarly document authors, Facebook users, Twitter users and conversations, patents, and policy documents. These networks are also accompanied by other networks, i.e. the text similarity network and the citation network, that are mainly used for comparison purposes.

      The work is of particular interest to the part of the scholarly community working on science map generation. However, some passages need further explanation to be clear to the reader.

      1. In the abstract, there is a mention of traditional and non-traditional data sources. While the text of the article does provide some clarification, it would be ideal to briefly explain in the abstract what the authors mean by these terms, since it is not immediately clear what a traditional data source is in the context of topic identification.

      2. In the introduction, the authors anticipate the outcomes of a previous work they have conducted on a similar topic. They claim that some topics are well-represented in maps based on citation links and text similarity, while others are not. However, it is not clear which sources they have used to get to this claim, and it is also not evident what the main difference is that characterises the current work compared to the previous one.

      3. In section 3, the authors introduce all the methods and materials used for their analysis. Although some of the material cannot be shared because it is behind a paywall (e.g. the Web of Science data), it is not clear from reading the section that all the code developed and the data obtained from the analysis have been published on Zenodo. While it is fine to address this aspect in the appropriate section at the end of the article, I would suggest stating this information at the beginning of section 3 as well, citing the Zenodo record appropriately and clarifying which material is not included in that record, thus explaining that the experiment cannot be fully reproduced.

      4. Considering all the external sources of networks, it is not clear what the time window of each source is. Do all these sources contain information from the publication year of the oldest article in the document set up to 2024?

      5. As far as I understood from the formula in section 3.7.1, the Purity is always calculated against a particular topic M. Thus, why not refer to such "M" in the formula definition, defining it in a function-like way Purity(N, M)? In addition, still in this section, it is not clear how the N clusters considered are selected. A running example of Purity calculation would probably help the reader here.

      6. In section 3.7.2, the denominator of the formula is set to 5. However, it is unclear why this number is a sensible choice for the calculation presented. Why not 6 or 7? Why not 3? I think the authors should clearly justify the choice of this denominator with explicit evidence.

      7. In section 3.7.3, it is not entirely clear what the difference is between topics and topic categories.

      8. In the discussion, it would be good to expand a bit on the work's limitations and to envision possible paths for future work in the area. A few points that I would love to see discussed in detail:

        • The analysis has been done using sources that may have changed drastically in the past months/years - e.g. Twitter, which, after becoming X, has seen many academics leave for more open (in a broad sense) platforms and networks (e.g. Mastodon and, more recently, BlueSky). Would it be possible to gather the necessary data from these platforms to run the study again? If yes, would it be possible to download them? If not, should we consider these sources unreliable for scientific purposes and, if so, what preconditions should be in place for their reliability? Considering the present situation, what is the relevance of the results obtained with the data gathered from Twitter (now X)?

        • The authors transparently claim that some of the data used (e.g. Web of Science data) are not freely available to the reader, thus preventing the full replication of the study. Is it possible to substitute these closed sources with others offering open research information? For instance, OpenCitations for gathering the citation network (full disclosure: I'm director of OpenCitations), PubMed and PubMed Central for gathering titles and abstracts of the article considered, etc.?

        • The core set of scholarly documents considered is primarily from the biomedical domain, since the authors considered only documents with a PubMed identifier. While the results shown are sensible for this domain, how well does the approach presented by the authors scale to other scholarly areas, e.g. Social Sciences and Humanities? Is it possible to speculate that the approach presented is discipline-agnostic? Is there any evidence for such a claim?
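
      To illustrate the function-like definition suggested in point 5, the sketch below shows one way a Purity(N, M) calculation could be written and run on a toy example. It is not the article's formula: it assumes, purely for illustration, that purity is the share of documents on topic M among all documents in the N clusters that hold the most topic-M documents. The names `purity`, `cluster_of`, and `topic_docs` are hypothetical.

```python
from collections import Counter


def purity(cluster_of, topic_docs, n):
    """Hypothetical Purity(N, M), under the assumption stated above.

    cluster_of: dict mapping document id -> cluster id
    topic_docs: set of document ids that address topic M
    n:          number of clusters (N) to take into account
    """
    # Total number of documents in each cluster.
    cluster_sizes = Counter(cluster_of.values())
    # Number of topic-M documents in each cluster.
    topic_counts = Counter(cluster_of[d] for d in topic_docs if d in cluster_of)
    # Assumed selection rule: the n clusters with the most topic-M documents.
    top_clusters = [c for c, _ in topic_counts.most_common(n)]
    if not top_clusters:
        return 0.0
    on_topic = sum(topic_counts[c] for c in top_clusters)
    total = sum(cluster_sizes[c] for c in top_clusters)
    return on_topic / total


# Toy running example: six documents in three clusters, three of them on topic M.
cluster_of = {"d1": "A", "d2": "A", "d3": "A", "d4": "B", "d5": "B", "d6": "C"}
topic_docs = {"d1", "d2", "d4"}

print(purity(cluster_of, topic_docs, 1))  # cluster A only: 2 of its 3 documents are on topic
print(purity(cluster_of, topic_docs, 2))  # clusters A and B: 3 of their 5 documents are on topic
```

      Making the topic an explicit argument shows that every purity value is tied to one topic, and the toy data show how the value changes as more clusters are included.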

      Some final remarks:

      A. The figures should be placed closer (i.e. at most on the next page) to where they are first mentioned.

      B. The research question is introduced in section 1 and is not explicitly mentioned again in the text. It would be ideal to add an explicit reference to that question where the authors present evidence to answer it (e.g. in section 4) and to recall the answer to that question in the conclusion of the paper.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      This manuscript described the translational responses to single and combined BCAA shortages in mouse cell lines. Using Ribo-seq and RNA-seq analysis, the authors found selective ribosome pausing at codons that encode the depleted amino acids, where the pausing at valine codons was prominent under both single and triple starvation whereas isoleucine codons showed pausing only under a single depletion. They analyzed the mechanisms of the unexpected selective pausing and proposed that the positional codon usage bias could shape the ribosome stalling and tRNA charging patterns across different amino acids. They also examined the stress responses and the changes in the protein expression levels under BCAA starvation.

      The manuscript was well-written, and the findings are interesting, especially their model that positional codon usage bias could be a regulator of ribosome pausing and tRNA charging levels. Although different translational responses to distinct amino acid starvation have been widely documented, the positional codon usage bias is an interesting aspect. The manuscript's central message could have been made clearer. The authors may consider emphasizing this point more explicitly in the abstract. The rich multi-omics dataset in this work provides valuable resources for the translation field.

      We thank the reviewer for the thoughtful and positive evaluation of our work.

      Major comments

      1. The abstract may need to be revised since it is hard to immediately catch the authors' main point. If the authors regard this work as a resource paper, the current version is fine. But it could be better to point out the positional codon usages the authors found, which is a strong point of the current manuscript.

      Response: We thank the reviewer for highlighting the importance of positional codon usage, which indeed represents a key finding of our study. We revised the abstract, and we now emphasize this aspect more clearly. However, in response to Reviewer #2, we have framed the observed positional effects and the idea of an elongation bottleneck as one possible contributing mechanism among others, and we relate it specifically to the attenuation of isoleucine-specific stalling under triple starvation.

      1. Page 18 "Beyond these tRNA dynamics, our data also highlight the importance of the codon positional context within mRNAs, indicating that where a codon is located within the CDS can influence both the extent of ribosomal stalling and overall translation efficiency during nutrient stress." This idea is interesting. To what extent do the authors think this could be generalized? The authors may discuss whether they think their proposed model is specific to the different ribosome stalling patterns between valine and isoleucine codons or can be generalized to other codon combinations. For example, the positional codon usage bias will differ among organisms; are there any previous reports on ribosome behaviors that align with their model?

      Response: We thank the reviewer for raising these important points. While our study primarily focuses on the differential stalling patterns of valine and isoleucine codons, we believe the underlying principle, that the position of codons within the CDS can modulate the extent of ribosome stalling, may under very specific circumstances extend beyond this amino acid pair. We expect this positional effect to be potentially relevant for combinations in which one amino acid has considerable enrichment near the 5′ end of coding sequences, coupled with starvation-sensitive tRNA isoacceptors, while the other does not. In our case, valine meets these criteria (see Fig. S11A and Fig. 6). In contrast, isoleucine and leucine codons, although also relatively frequent, show more variable positional distributions and are both decoded by isoacceptors that appear more resistant to starvation, as illustrated in Fig. 6 and reported for mammals and bacteria in Saikia et al. 2016; Darnell, Subramaniam, and O’Shea 2018; Elf et al. 2003; Dittmar et al. 2005. To explore the generalizability of this model, we have now included a transcriptome-wide analysis of codon position biases in mouse for all codons in the revised manuscript (Supplementary Figures 10 and 11). This analysis may serve as a basis to identify additional candidate codons for future studies. Furthermore, we now mention in the Discussion that amino acids with properties similar to those of valine regarding their positional distribution and tRNA isoacceptors, such as phenylalanine and glutamine, whose tRNA isoacceptors are predicted to be fully deacylated under their respective starvation in bacteria (Elf et al. 2003), could be promising candidates for testing this model, in combination with amino acids whose tRNAs are expected to remain partially charged under starvation or to be depleted at the start of the CDS, such as His (Supplementary Fig. 11C).

      Even if the authors think this model can be applied to BCAA starvation, would it be possible to explain the different isoleucine codon responses between single and double starvation? The authors may discuss why the ribosome stalling at isoleucine AUU and AUC codons was slightly attenuated under double starvation. And how about the different leucine codon responses among single, double, and triple starvations, although the pausing is not as strong as isoleucine and valine codons?

      Response: Regarding the attenuated isoleucine stalling under double starvation, we believe this is primarily due to stronger inhibition of the mTORC1 pathway when leucine is co-depleted (i.e., in the double starvation condition; Fig. 2D–F). This results in a more substantial suppression of global translation, reducing overall tRNA demand and thereby mitigating stalling (Darnell, Subramaniam, and O’Shea 2018). A similar effect may explain the only mild leucine codon stalling observed under single leucine starvation, which also triggers strong mTORC1 inhibition and reduced initiation. In contrast, triple starvation does not suppress mTORC1 to the same extent, and thus reduced initiation alone cannot explain the absence of leucine codon stalling. Instead, we propose that additional features, such as the relative sensitivity of tRNA isoacceptors to starvation and their aminoacylation dynamics, must be considered. Valine tRNAs, for example, are known to be highly sensitive and become strongly deacylated under starvation in bacteria (Elf et al. 2003), a pattern that we also find in our own data (Fig. 6). Leucine tRNAs, by contrast, appear more resistant, possibly due to better amino acid recycling or isoacceptor-specific differences in charging kinetics, though further validation would be needed. Combined with the strong stalling at 5′-enriched valine codons, this could reduce downstream ribosome traffic and limit exposure of leucine codons, thus preventing stalling. However, our new analysis of the positional relationship between valine and leucine codons within individual transcripts (now shown in Supplementary Figure 11B) did not reveal as strong a pattern as we observed for valine and isoleucine codons. We now discuss these points and their implications in the revised Discussion.

      Experimental validation using artificial reporters carrying biased sequences may also be considered.

      Response: We appreciate the reviewer’s suggestion. In fact, we explored this experimentally using a dual-fluorescent reporter system (GFP–RFP) (Juszkiewicz and Hegde 2017) containing consecutive Val or Ile codons. However, the constructs yielded variable and non-reproducible results under starvation conditions. In addition, testing the role of codon position would require placing the same codons at multiple defined positions within a single transcript and performing ribosome profiling directly on the reporter. This type of targeted experimental validation is technically challenging and falls beyond the scope of the current study. We now mention this explicitly in the revised Discussion as an interesting direction for future work.

      1. Page 13 "Moreover, we noticed that DT changes extend beyond the ribosomal A-site, including the P-site, E-site, and even further positions (Supplementary Fig. 2A), consistent with other studies on single amino acid starvation 39 (Supplementary Fig. 2B-C)." Could the widespread DT changes be due to the Ribo-DT pipeline they used or to difficulties in offset determination? Indeed, the authors showed that this feature was found in other datasets, but it seems that these datasets were processed and analyzed in the same way as their data. The original Ribo-DT paper (Gobet and Naef, 2022, Methods) also showed some widespread DT changes even from RNA-seq. Another analysis method, the codon subsequence abundance shift that is part of the diricore analysis (Loayza-Puch et al., 2016, Nature), did not show such broad changed regions. The authors are encouraged to re-analyze the data sets using different methods.

      Response: We agree with the reviewer that the fact that DT changes extend beyond the ribosomal A-site is puzzling, but this has already been seen in other papers using other approaches (Darnell, Subramaniam, and O’Shea 2018). To validate that this shift is not due to our A-site assignment, enrichment analysis, or DT method, we applied the Diricore pipeline to our Ribo-Seq data. The output of the pipeline provides either 5’-end ribosome density or “subsequence” analysis using an A-site offset for each read size based on the metagene profile at the start codon. Both analyses show the same enriched codons across the different conditions as in our analyses, and the broad shift is similar, with the maximum signal at the E, -1 position (Fig. R1).

      1. Page 13 "Intriguingly, only two of the three isoleucine codons (AUU and AUC) showed increased DTs upon Ile starvation (p < 0.01), while just one leucine codon (CUU) exhibited a modest but significant DT increase (p < 0.01) under Leu starvation (Figure 1A-B, Supplementary Figure 2A)." How can the authors explain the different strengths of ribosome pausing at Ile codons under Ile and double starvation? The AUA codon did not show any pausing under either of the starvation conditions. Throughout the manuscript, the authors mainly describe the difference between amino acids but it is desirable to discuss the codon-level difference as well.

      Response: Thank you for raising this point. The observed differences in stalling between the isoleucine codons can likely be explained by differences in tRNA isoacceptor charging and positional bias within transcripts. The AUA codon is decoded by a distinct tRNAIle isoacceptor (tRNAIleUAU), which, according to our tRNA charging data (Fig. 6), remains largely charged during Ile starvation. This observation aligns with previous reports suggesting that this isoacceptor is more resistant to starvation-induced deacylation in mammalian cells and bacteria (Saikia et al. 2016; Elf et al. 2003). In contrast, the AUU and AUC codons are primarily decoded by the tRNAIleAAU isoacceptor, which we find to be strongly deacylated under Ile starvation, likely contributing to the observed codon-specific ribosome pausing. Additionally, we found that the AUA codons are relatively rare in general and particularly underrepresented near the 5′ ends of coding sequences. Our new spatial analysis (now included in Supplementary Figure 11B) confirms that AUA codons tend to occur downstream of AUU and AUC codons within transcripts. This potentially further reduces stalling on these codons and further diminishes their apparent DT increase under starvation. In order to better explain these important points, we have now expanded the codon-level discussion of these differences in the revised manuscript.

      1. Page 13 "We examined the effects of single amino acid starvations (-Leu, -Ile and -Val), as well as combinations, including a double starvation of leucine and isoleucine (hereafter referred to as "double") and a starvation of leucine, isoleucine, and valine ("triple"), allowing us to identify potential non-additive effects." The different double starvations, isoleucine and valine, and leucine and valine, will further support their hypothesis on the effects of the positional codon usage bias on ribosome pausing and tRNA charging patterns. Although this could be beyond the scope of the current manuscript, the authors are encouraged to provide a rationale for the chosen combination.

      Response: Our experimental design evolved stepwise: we initially focused on leucine and isoleucine depletion because we found in our previous work in the mouse liver (Gobet et al. 2020) that, despite their structural similarity, these had short and long dwell times, respectively. Valine was included at a later stage to cover all the BCAAs. At the time, we did not anticipate valine to yield particularly striking effects in cells, and therefore we did not include systematic pairwise depletions involving valine. However, the strong and unexpected stalling observed at valine codons, especially under triple starvation, became a central aspect of the study. Thus, we agree that additional combinations, such as Leu/Val or Val/Ile, could be informative and now mention this in the Discussion as a potential direction for future studies.

      Minor comments

      Page 16 "these results imply that BCAA deprivation lowers protein output through multiple pathways: a combination of reduced initiation, direct elongation blocks (stalling), and possibly an increased proteolysis" This conclusion is totally right but may be too general. Could the authors summarize BCAA-specific features of the events including reduced initiation, stalling, and proteolysis that all contribute to protein outputs? This is not well discussed in the latter sections including Discussion.

      Response: We thank the reviewer for this helpful suggestion. We agree that the original statement was too general and have revised the relevant section to more clearly delineate the distinct responses observed under each BCAA starvation condition. Specifically, we now summarize that valine starvation is characterized by strong, positionally biased ribosome stalling; leucine starvation primarily impacts translation initiation, likely via mTORC1 repression; and isoleucine starvation shows a mixed phenotype, with features of both impaired initiation and codon-specific elongation delays. We also clarify that while protein stability or degradation may contribute to the observed changes in protein output, our current data do not allow for quantitative assessment of proteolytic effects (e.g., changes in protein half-life). Therefore, we refrain from making direct quantitative conclusions about the differential modulations of proteolysis and instead focus our discussion on the translational mechanisms supported by our data.

      Reviewer #1 (Significance):

      The manuscript was well-written, and the findings are interesting, especially their model that positional codon usage bias could be a regulator of ribosome pausing and tRNA charging levels. Although different translational responses to distinct amino acid starvation have been widely documented, the positional codon usage bias is an interesting aspect. The manuscript's central message could have been made clearer. The authors may consider emphasizing this point more explicitly in the abstract. The rich multi-omics dataset in this work provides valuable resources for the translation field.

      We thank the reviewer for the encouraging comments and share the view that positional codon-usage bias is an important result; accordingly, we now underscore this point explicitly in the revised Abstract. We also emphasise that our other observations are, to our knowledge, novel: only a handful of multi-omics studies have combined ribosome-pausing profiles with direct tRNA-aminoacylation measurements, and none has systematically examined multiple amino-acid-deprivation conditions as presented here.

      Reviewer #2 (Evidence, reproducibility and clarity):

      This study examines the consequences of starvation for the BCAAs, either singly, for Leu & Ile, or for all three simultaneously in HeLa cells on overall translation rates, decoding rates at each codon, and on ribosome density, protein expression, and distribution of ribosome stalling events across the CDS for each expressed gene. The single amino acid starvation regimes specifically reduce the cognate intracellular amino acid pool and lead to deacylation of at least a subset of the cognate tRNAs in a manner dependent on continuing protein synthesis. They also induce the ISR equally and decrease bulk protein synthesis equally in a manner that appears to occur largely at the initiation level for -Leu and -Val, judging by the decreased polysome:monosome ratio, but at both the initiation and elongation levels for -Ile, a distinction that remains unexplained. Only -Leu appears to down-regulate mTORC1 and TOP mRNA translation. There is a significant down-regulation of protein levels for 50-200 genes, which tend to be unstable in nutrient-replete cells, only a fraction of which are associated with reduced ribosome occupancies (RPFs measured by Ribo-Seq) on the corresponding mRNAs in the manner expected for reduced initiation, suggesting that delayed elongation is responsible for reduced protein levels for the remaining fraction of genes. All three single starvations lead to increased decoding times for a subset of the cognate "hungry" codons: CUU for -Leu, AUU and AUC for -Ile, and all of the Val codons, in a manner that is said to correspond largely to the particular tRNA isoacceptors that become deacylated, although this correspondence was not explained explicitly and might not be as simple as claimed. 
All three single starvations also evoke skewing of RPFs towards the 5' ends of many CDSs in a manner correlated with an enrichment within the early regions of the CDSs for one or more of the cognate codons that showed increased decoding times for -Ile (AUC codon) and -Val (GUU, GUC, and GUG), but not for -Leu, the latter of which was not accounted for. These last findings suggest that, at least for -Val and -Ile, delays in decoding N-terminal cognate codons cause elongating ribosomes to build up early in the CDS. They go on to employ a peak calling algorithm to identify stalling sites in an unbiased way within the CDS, which are greatest in number for -Val, and find that Val codons are enriched in the A-sites (slightly) and adjacent 5' nucleotides (to a greater extent) for -Val starvation; and similarly for Ile codons in -Ile conditions, but not for -Leu starvation, again for unknown reasons. It's unclear why their called stalling sites have various other non-hungry codons present in the A sites with the cognate hungry codons being enriched further upstream, given that stalling should occur with the "hungry" cognate codon in the A site. The proteins showing down-regulation are enriched for stalling sites only in the case of the -Val starvation in the manner expected if stalling is contributing to reduced translation of the corresponding mRNA. It's unclear why this enrichment apparently does not extend to -Ile starvation, which shows comparable skewing of RPFs towards the 5' ends, and this fact diminishes the claim that pausing generally contributes to reduced translation for genes with abundant hungry codons. 
All of the same analyses were carried out for the Double -Ile/-Leu and Triple starvations and yield unexpected results, particularly for the triple starvation wherein decoding times are increased only at Val codons, skewing of RPFs towards the 5' ends of CDSs is correlated only with an enrichment for Val codons within the early regions of the CDSs, and stall sites are enriched only for Val codons at nearby upstream sites, all consistent with the finding that only Val tRNAs become deacylated in the Triple regime. To explain why only Val tRNA charging is reduced despite the observed effective starvation for all three amino acids, they note first that stalling at Val codons is skewed towards the 5' ends of CDS for both -Val and triple starvations more so than observed for -Ile or -Leu starvation, which they attribute to a greater frequency of Val codons vs Ile codons in the 5' ends of CDSs. As such, charged Val tRNAs are said to be consumed in translating the 5' ends of CDSs and the resulting stalling prevents ribosomes from reaching downstream Ile and Leu codons at the same frequencies and thus prevents deacylation of the cognate Ile and Leu tRNAs. It's unclear whether this explanation is adequate to explain the complete lack of Ile or Leu tRNA deacylation observed even when amino acid recycling by the proteasome is inhibited, a treatment shown to exacerbate deacylation of cognate tRNAs in the single amino acid starvations and of Val tRNA in the triple starvation. As such, the statement in the Abstract "Notably, we could show that isoleucine starvation-specific stalling largely diminished under triple starvation, likely due to early elongation bottlenecks at valine codons" might be too strong and the word "possibly" would be preferred over "likely". It's also unclear why the proteins that are down-regulated in the triple starvation are not significantly enriched for stalling sites (Fig. 5B) given that the degree of skewing is comparable or greater than for -Val. 
This last point seems to undermine their conclusion in the Abstract that "many proteins downregulated under BCAA deprivation harbor stalling sites, suggesting that compromised elongation contributes to decreased protein output." In the case of the double -Ile/-Leu starvation, a related phenomenon occurs wherein decoding rates are decreased for only the AUU Ile codon and only the AAU Ile tRNA becomes deacylated; although in this case increased RPFs in the 5' ends are not correlated with enrichment for Ile or Leu codons and, although not presented, apparently stall sites are not associated with the Ile codon in the double starvation. In addition, stalling sites are not enriched in the proteins down-regulated by the double starvation. Moreover, because Ile codons are not enriched in the 5' ends of CDS, it doesn't seem possible to explain the selective deacylation of the single Ile tRNA observed in the double starvation by the same "bottleneck" mechanism proposed to explain selective deacylation of only Val tRNAs during the triple starvation. This is another reason for questioning their "bottleneck" mechanism.

      We thank the reviewer for their in-depth assessment, exhaustive reading, and constructive feedback, which have greatly contributed to improving the clarity and contextualization of our manuscript. We would first like to clarify that all experiments in this study were conducted in NIH3T3 mouse fibroblasts, not HeLa cells; we assume this was a misunderstanding and have verified that the correct cell line is consistently indicated throughout the manuscript. We also clarify that our data show that -Leu, double starvation, and to a lesser extent -Ile, downregulate mTORC1 signaling and TOP mRNA translation, whereas -Val and triple starvation had minimal effects on these pathways. We agree that some of our conclusions and observed phenomena were not explained in sufficient detail in the original version. To address this, we have significantly reworked the discussion, added complementary figures, and clarified key points throughout the text to better convey the underlying rationale and biological interpretation of our findings. We address each of the reviewer’s points in detail in the point-by-point responses below.

      Specific comments (some of which were mentioned above):

      -The authors have treated cells with CHX in the Ribo-Seq experiments, which has been shown to cause artifacts in determining the locations of ribosome stalling in vivo owing to continued elongation in the presence of CHX (https://doi.org/10.1371/journal.pgen.1005732). The authors should comment on whether this artifact could be influencing some of their findings, particularly the results in Fig. 5C where the hungry codons are often present upstream of the A sites of called stalling sites in the manner expected if elongation continued slowly following stalling in the presence of CHX.

      Response: We thank the reviewer for raising this important concern. We would like to clarify that our ribosome profiling protocol did not include CHX pretreatment of live cells. CHX was added only during the brief PBS washes immediately before lysis and in the lysis buffer itself. This approach aligns with best practices aimed at minimizing post-lysis ribosome run-off, and is intended to prevent the downstream ribosome displacement artifacts described by Hussmann et al. 2015, which result from pre-incubation of live cells with CHX for several minutes before harvesting. Furthermore, recent studies have demonstrated that CHX-induced biases are species-specific. For instance, Sharma et al. 2021 found that human (and mouse) ribosomes are not susceptible to conformational restrictions by CHX, nor does CHX distort gene-level measurements of ribosome occupancy. This suggests that the use of CHX in the lysis buffer, as performed in our protocol, is unlikely to introduce significant artifacts in our ribosome profiling data. To further support this, we reanalyzed data from Darnell, Subramaniam, and O’Shea 2018, where the ribosome profiling samples were prepared without any CHX pretreatment or CHX in the wash buffer, and still observed similar upstream enrichments in their stalling profiles (see Supplementary Figure 2B-C in our manuscript). Additionally, in our previous work (Gobet et al. 2020), we compared ribosome dwell times with and without CHX in the lysis buffer and found no significant differences, reinforcing the notion that CHX use during lysis does not substantially affect the measurement of ribosome stalling. Given these considerations, we believe that CHX-related artifacts, such as downstream ribosome movement, are unlikely to explain the enrichment of hungry codons upstream of identified stalling sites in our data. We have now adjusted the Methods section to clarify this point.

      -p. 12: "These starvation-specific DT and ribosome density modulations were also evident at the individual transcript level, as exemplified by Col1a1, Col1a2, Aars, and Mki67 which showed persistent Val-codon-specific ribosome density increases but lost Ile-codon-specific increases under triple starvation (Supplementary Figure 3A-D). " This conclusion is hard to visualize for any but Val codons. It would help to annotate the relevant peaks of interest for -Ile starvation with arrows.

      Response: We agree and thank the reviewer for this observation. We have now annotated exemplary peaks in Supplementary Figure 3A–D to highlight ribosome pileups over Ile codons. However, we agree that it is still hard to visualize in the given Figure. Therefore, we added scatter plots for each of the transcripts that show the RPM of each position in the Ctrl vs starvation to allow for a better illustration of the milder effects upon Ile starvation (Supplementary Figure 4).

      -To better make the point that codon-specific stalling under BCAA starvation appears to be not driven by codon usage, rather than the analysis in Fig. 1H, wouldn't it be better to examine the correlation between increases in DT under the single amino acid starvation conditions and the codon frequencies across all codons?

      Response: We appreciate the suggestion. We have now added an additional analysis correlating the change in DT with codon usage frequency for each starvation condition. This is included in Supplementary Figure 5A-D and supports our interpretation that codon frequency alone does not explain the observed stalling behavior.

      -p. 13, entire paragraph beginning with "Our RNA-seq and Ribo-seq revealed a general activation of stress response pathways across all starvations..." It is difficult to glean any important conclusions from this lengthy analysis, and the results do not appear to be connected to the overall topic of the study. If there are important conclusions here that relate to the major findings then these connections should be made or noted later in the Discussion. If not, perhaps the analysis should be largely relegated to the Supplemental material.

      Response: We thank the reviewer for this comment. The paragraph in question is intended to provide a global overview of transcriptional and translational responses across the starvation conditions. It serves both as a quality control (e.g., PCA clustering and global shifts in RPF/RNA-seq profiles), and to confirm that expected starvation-induced responses are among the strongest detectable signals separating the starved samples from the control. Indeed, these observations establish that the perturbations are effective and that hallmark nutrient stress responses are globally engaged across conditions. Importantly, very few studies to date have examined transcriptional and translational responses under single or combined branched-chain amino acid (BCAA) starvation conditions. It therefore remains unclear to what extent BCAA depletion broadly remodels gene expression and translation. Our analysis contributes to addressing this gap, revealing that while certain stress pathways are commonly induced, others show condition-specific patterns such as we observed for -Ile starvation. To maintain focus, we have kept the detailed pathway analyses and transcript-level enrichments in the Supplement and rewritten the corresponding text in a more compact manner, reducing it by more than one third.

      -p. 15: "Together, these findings highlight that BCAA starvation triggers a combination of effects on initiation and elongation, with varying dynamics by amino acid starvation." I take issue with this statement as it appears that translation is reduced primarily at the initiation step for all conditions except -Ile. As noted above, these data are never mentioned in the DISCUSSION as to why only -Ile would show a marked elongation component to the inhibition whereas -Val gives the greatest amount of ribosome stalling.

      Response: We acknowledge the reviewer’s point. While the polysome profiles (Figure 3F-H) directly indicate that most conditions repress initiation, codon- and condition-specific elongation defects can still contribute to reduced protein output, even if they are not always detectable as global polysome shifts. Polysome profiles reflect the combined outcome of reduced initiation (which decreases polysome numbers) and ribosome stalling (which can, but does not always have to, increase ribosome density on individual transcripts, potentially counteracting the effects of reduced initiation). For valine starvation strong stalling occurs very early in the CDS (Figure 5F). This bottleneck restricts overall ribosome movement to downstream regions. Thus, while elongation is profoundly impaired, the total number of ribosomes per transcript (which polysome signals largely reflect) may appear low due to reduced overall ribosome traffic. In contrast, isoleucine codon stalling tends to occur also further downstream on the transcript (Figure 5F), allowing ribosomes to accumulate in larger numbers on the mRNA, leading to a clearer "elongation signature" in polysome profiles (Figure 3F, H). Additionally, we observed slightly higher inter-replicate variance for isoleucine starvation (Supplementary Figure 6B), which may have reduced the number of statistically significant stalling sites extracted compared to valine. We have revised the main text and discussion to clarify these points.

      -I cannot decipher Fig. 4D and more detail is required to indicate the identity of each column of data.

      Response: We thank the reviewer for pointing this out. Figure 4D (now Figure 4E) presents an UpSet plot, which is a scalable alternative to Venn diagrams commonly used to visualize intersections across multiple sets. Briefly, each bar in the upper plot represents the number of transcripts with increased 5′ ribosome coverage (Δpi < -0.15; p < 0.05) shared across the conditions indicated in the dot matrix below. Each column in the dot matrix highlights the specific combination of conditions contributing to a given intersection (e.g., dots under “Val” and “Triple” show the overlap between these two). To improve clarity, we have expanded the figure legend accordingly and now refer to the UpSetR methodology in the main text.
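The bar heights in an UpSet plot of this kind can be reproduced by counting, for each transcript, the exact combination of conditions in which it passes the threshold. The sketch below is only an illustration: the function name and the transcript identifiers are hypothetical, and it assumes exclusive intersections (each transcript is counted in exactly one column), which may differ from the settings used for the figure.

```python
from collections import Counter


def exclusive_intersections(sets_by_condition):
    """Count items by the exact combination of conditions they belong to.

    sets_by_condition maps a condition name to the set of transcripts that
    pass the threshold in that condition. Each transcript contributes to
    exactly one combination, i.e. one column of the dot matrix.
    """
    membership = {}
    for condition, items in sets_by_condition.items():
        for item in items:
            membership.setdefault(item, set()).add(condition)
    return Counter(frozenset(conditions) for conditions in membership.values())


# Toy input with made-up transcript identifiers (not data from the study).
polarized = {
    "Val": {"t1", "t2", "t3"},
    "Triple": {"t2", "t3", "t4"},
    "Ile": {"t3"},
}
counts = exclusive_intersections(polarized)
for combination, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(sorted(combination), n)
```

A column with dots under "Val" and "Triple" only then corresponds to `counts[frozenset({"Val", "Triple"})]`.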

      -In Fig. 4E, one cannot determine what the P values actually are, which should be provided in the legend to confirm statistical significance.

      Response: Thank you for pointing that out. The legend in Figure 4E (now Figure 4F) for the p-values was accidentally removed during figure editing. We have added the legend back, so that the statistical significance is clear.

      -It's difficult to understand how the -Leu condition and the Double starvation can produce polarized RPFs (Fig. 4A) without evidence of stalling at the cognate hungry codons (Fig. 4E), despite showing later in Fig. 5A that the numbers of stall sites are comparable in those cases to that found for -Ile.

      Response: We appreciate this comment, which points to an important property of RPF profiles under nutrient stress. As shown in Figure 4A, all starvation conditions induce a degree of 5′ ribosome footprint polarization, a pattern that can be observed under various stress conditions and perturbations (Allen et al. 2021; Hwang and Buskirk 2017; Li et al. 2023). This general 5′ bias likely reflects a combination of slowed elongation and altered ribosome dynamics and is not necessarily linked to codon-specific stalling. However, Val and Triple starvation show a much stronger and more asymmetric polarization, characterized by pronounced 5′ accumulation and 3′ depletion of ribosome density. To better illustrate this, we have updated the visualization of polarity scores and added a new bar chart summarizing the number of transcripts showing strong 5′ polarization under each condition. This quantification highlights that the effect is markedly more prevalent under Val and Triple conditions than under Leu or Double starvation. In addition, Figure 4F demonstrates that this polarity is codon-specific under Val and Triple starvation. We clarify that this analysis tests for enrichment of specific codons near the start codon among the polarized transcripts and does not directly assess stalling. The observed enrichment of Val codons in the 5′ regions of polarized transcripts supports the interpretation that early elongation delays contribute to the RPF shift. In contrast, no such enrichment is observed for Leu starvation, reinforcing that Leu-induced polarity is not driven by stalling at Leu codons. While Figure 5 shows a similar number of peak-called stalling sites in -Leu, -Ile, and Double starvation, we note that Ribo-seq signal variability under Ile starvation was higher, which may have limited statistical power for detecting stalling sites, even though clear dwell time increases were observed at specific codons. 
Additionally, we have improved the metagene plots depicting total ribosome footprint density in Figure 4A. The previous version incorrectly showed sharp drops at CDS boundaries due to binning artifacts. The updated version more accurately reflects the density distribution and further highlights the stronger polarization in Val and Triple conditions. Together, these clarifications and improvements within the main text now more clearly distinguish between general polarity effects and codon-specific stalling.
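For readers unfamiliar with polarity scores, the following is a minimal sketch of one commonly used definition, in which each CDS position is weighted linearly from -1 (first position) to +1 (last position) and the score is the density-weighted mean of those weights. The response does not state the exact formula used, so this definition, the function name, and the toy densities are all assumptions made for illustration.

```python
def polarity_score(density):
    """Polarity of footprint density along one CDS, between -1 and 1.

    density is a sequence of footprint counts per position. Negative values
    indicate density shifted toward the 5' end, positive values toward the
    3' end, and a uniform profile gives 0.
    """
    length = len(density)
    total = sum(density)
    if length < 2 or total == 0:
        raise ValueError("need at least two positions and non-zero coverage")
    weighted = sum(
        d * (2 * i - (length + 1)) / (length - 1)
        for i, d in enumerate(density, start=1)
    )
    return weighted / total


# Toy profiles (not data from the study): flat control, 5'-skewed starvation.
control = [4, 4, 4, 4, 4, 4]
starved = [9, 6, 4, 3, 1, 1]
delta_pi = polarity_score(starved) - polarity_score(control)
print(delta_pi)
```

Under this definition, a threshold such as Δpi < -0.15 selects transcripts whose density has moved toward the 5' end relative to control.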

      -Fig. 5B: the P values should be given for all five columns, and it should be explained here or in the Discussion why the authors conclude that stalling is an important determinant for reduced translation when a significant correlation seems to exist only for the -Val condition and not even for the Triple condition.

      Response: We thank the reviewer for this important observation. In response, we have revised both the text and the figures to provide a clearer and biologically more meaningful representation of the relationship between ribosome stalling and reduced protein output. Specifically, we have replaced the previous Figure 5B with a new analysis that stratifies transcripts based on the number of identified stalling sites. This updated analysis, now shown in Figure 5B, reveals that under Val and Triple starvation conditions, proteins that are downregulated tend to originate from transcripts with multiple stalling sites. Importantly, the corresponding p-values for all five conditions are now explicitly shown in the figure (as red lines). As the reviewer correctly notes, only the Val condition shows a statistically significant enrichment when considering overall overlap. Triple starvation shows a similarly high proportion of overlap (72.3%) but does not reach statistical significance, likely due to the more complex background composition under combined starvation, which increases the expected overlap and reduces statistical power. By stratifying transcripts by the number of stalling sites, we uncover that transcripts with ≥2 stalling sites are enriched among downregulated proteins specifically under Val and Triple conditions, providing a more robust indication of the link between stalling and translation repression under valine deprivation. We believe this refined approach, prompted by the reviewer’s comment, offers a clearer and biologically more relevant perspective on the role of ribosome stalling. The original analysis previously shown in Figure 5B is now provided as Supplemental Figure 10C for transparency and comparison. We have clarified this in the revised text and now interpret the relationship more cautiously.

      -p. 17: "Of note, in cases where valine or isoleucine codons were present just upstream (rather than at) the stalling position, we noted a strong bias for GAG (E), GAA (E), GAU (D), GAC (D), AAG (K), CAG (Q), GUG (V) and GGA (G) (Val starvation) and AAC (N), GAC (D), CUG (L), GAG (E), GCC (A), CAG (Q), GAA (E) and AAG (K) (Ile starvation) at the stalling site (Supplementary Figure 7B)." The authors fail to explain why these codons would be present in the A sites at stalling sites rather than the hungry codons themselves, especially since it is the decoding times of the hungry codons that are increased according to Fig. 1A-E. As suggested above, is this a CHX artifact?

      Response: We agree that the observation that the listed codons are enriched at identified stalling positions (now Supplementary Figure 10C), while the depleted amino acid codon is located upstream, is a finding that needs more detailed explanation. Importantly, this phenomenon is not attributable to CHX artifacts, as our Ribo-seq protocol employs CHX solely during brief washes and lysis to prevent post-lysis ribosome run-off, rather than live-cell pre-treatment. Instead, we propose two hypotheses to explain this pattern: Firstly, many of these enriched codons are already inherently slow-decoded with longer DTs even under control conditions (Supplementary Figure 5H, newly added). Together with the upstream hungry codons they might form a challenging consecutive decoding environment, which results in an attenuated ribosome slowdown downstream after the hungry codon. Second, ribosome queuing may further explain this pattern. When a ribosome encounters a critically hungry codon and stalls, subsequent ribosomes can form a queue. The codon within the A-site of the queued ribosome would be (more or less) independent of the identity of the hungry codon itself that caused the initial stall. Since the listed codons have a high frequency within the transcriptome (Supp. Fig 5B), they therefore have an increased likelihood of appearing at this “stalling site”. Importantly, both of these phenomena are not necessarily represented by a general increase of DT on all of the listed codons and would therefore only be captured by the direct extraction of stalling sites but might be averaged out in the global dwell time analysis. We mention this phenomenon now in the Discussion.

      -Fig. 5D: P values for the significance, or lack thereof, of the different overlaps should be provided.

      Response: Thanks for pointing out this omission. We have now computed hypergeometric p-values for comparisons shown in Figure 5D and Figure 5E, and report them directly in the main text. As described, the overlap in stalling sites between Val and triple starvation is highly significant (2522 positions, p < 2.2×10⁻¹⁶), while overlaps involving Ile-specific stalling positions are smaller but still statistically robust (e.g., 149 positions for Ile – Triple, p = 1.77×10⁻⁵²). Notably, we also calculated p-values at the transcript level and found that a large fraction of transcripts with Ile-specific stalling under single starvation also stall under triple starvation, though often at different positions (1806 transcripts, p = 1.78×10⁻⁵⁸). These values are now included in the revised results section to support the interpretation of these overlaps.
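As an illustration of how an overlap p-value of this kind is obtained, the sketch below computes the upper tail of the hypergeometric distribution in log space using only the standard library. The function names and the toy numbers are hypothetical, and the choice of background (the "universe" of testable positions or transcripts) is an assumption that strongly affects the result; the response does not specify which background was used.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_upper_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) for the overlap X of two sets of sizes size_a and
    size_b drawn at random from a background of `universe` items."""
    upper = min(size_a, size_b)
    if overlap > upper:
        return 0.0
    # Smallest overlap that is possible at all for these set sizes.
    lower = max(overlap, 0, size_a + size_b - universe)
    log_denominator = log_comb(universe, size_b)
    log_terms = [
        log_comb(size_a, k)
        + log_comb(universe - size_a, size_b - k)
        - log_denominator
        for k in range(lower, upper + 1)
    ]
    # Log-sum-exp keeps very small probabilities from underflowing early.
    peak = max(log_terms)
    tail = math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)
    return min(1.0, tail)


# Toy numbers (not values from the study).
print(hypergeom_upper_tail(overlap=2, size_a=4, size_b=3, universe=10))
```

Probabilities below the smallest representable double are returned as 0.0, which is why very small values are often reported as bounds such as p < 2.2×10⁻¹⁶.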

      -p. 17: "Nonetheless, when we examined entire transcripts rather than single positions, many transcripts that exhibited isoleucine-related stalling under Ile starvation also stalled under triple starvation, but at different sites along the CDS (Figure 5E). This finding is particularly intriguing, as it suggests that while Ile-starvation-specific stalling sites may shift under triple starvation, the overall tendency of these transcripts to stall remains." The authors never come back to account for this unexpected result.

      Response: Thank you for highlighting this point. We've incorporated this finding as part of the proposed "bottleneck" scenario. While the isoleucine-specific stalling sites identified under Ile starvation do shift or disappear under triple starvation, we've observed that the same transcripts still tend to exhibit stalling. However, this now primarily occurs at upstream valine codons. We interpret this as a consequence of early elongation stalling caused by strong pausing at Val codons. This restriction on ribosome progression effectively prevents ribosomes from reaching the original Ile stalling sites. Therefore, the stalling sites identified under triple starvation are largely explained by the Val codons, reflecting a redistribution of stalling rather than its loss. To further clarify this crucial point, we've now explicitly mentioned Figure 5D-E again in the subsequent paragraph, which introduces the bottleneck theory.

      -It seems very difficult to reconcile the results in Fig. 5F with those in Fig. 4A, where similar polarities in RPFs are observed for -Ile and -Val in Fig. 4A but dramatically different distributions of stalling sites in Fig. 5F. More discussion of these discrepancies is required.

      Response: Thank you for pointing this out. The apparent discrepancy between the RPF profiles shown in Figure 4A and the stalling site distributions in Figure 5F likely reflects the fact that RPF polarization includes both general (unspecific) and codon-specific components. Figure 4A displays total ribosome footprint density, capturing both broad stress-induced effects and codon-specific contributions, whereas Figure 5F focuses specifically on peak-called stalling sites, representing localized and statistically significant pauses. Importantly, we would like to emphasise that Fig 4 shows that -Val and -Ile starvation exhibit different responses and not the same patterns. To make these differences even clearer, we have now updated the visualizations in Figure 4, including improved polarity plots and a new bar chart summarizing the number of transcripts with strong 5′ polarization. These additions highlight that the RPF profiles under -Val starvation are more pronounced and asymmetric, particularly due to 3′ depletion, while the polarity under -Ile is milder and a distinct, much smaller subset of transcripts appears to show polarity score shifts. We believe the updated figures and accompanying explanations now make these distinctions clearer.

      -p. 18: "These isoacceptor-specific patterns correlate largely with the particular subsets of leucine and isoleucine codons that stalled (Figure 1A)." This correlation needs to be addressed for each codon-anticodon pair for all of the codons showing stalling in Fig. 1A.

      Response: We thank the reviewer for this important comment. In the revised manuscript, we have expanded the relevant sections to address codon–anticodon relationships more thoroughly. We now explicitly match codons that exhibited increased dwell times under starvation to the corresponding tRNA isoacceptors whose charging was affected, and we provide a clearer discussion of the caveats involved. As noted by the reviewer, this correlation is not straightforward, as it is complicated by wobble base pairing, anticodon modifications, and the fact that multiple codons can be decoded by more than one isoacceptor, and vice versa. Moreover, in our qPCR-based tRNA charging assay, certain isoacceptors cannot be distinguished due to highly similar sequences (e.g., LeuAAG and LeuUAG, and LeuCAA and LeuCAG), which limits resolution for exact pairing. In addition, we did not assess absolute tRNA abundance, which may further influence decoding capacity. Nevertheless, where resolution is possible, the patterns align well: All tRNAVal isoacceptors became uncharged under Val and triple starvation, matching the consistent dwell time increases across all Val codons. Only tRNAIleAAU (decoding AUU and AUC) was deacylated, matching to these codons showing increased dwell times, while AUA (decoded by still-charged tRNAIleUAU) did not. Only CUU (decoded by uncharged tRNALeuGAA) showed increased dwell time. A mild deacylation of the other Leu isoacceptors was observed, but isoacceptor-level resolution is limited by assay constraints. However, these rather minimal tRNA and DT changes were consistent with more dominant initiation repression rather than elongation stalls. To support this analysis, we included an illustrative figure (now in Supplementary Figure 12F) summarizing the codon–anticodon matches.

      -p. 19: "For instance, in our double starvation condition, unchanged tRNA charging levels (Figure 6E) may result from a pronounced downregulation of global translation initiation, likely driven by the activation of stress responses (Figure 2), subsequently lowering the demand for charged tRNAs as it has been observed previously for Leu starvation 39.” This seems at odds with the comparable down-regulation of protein synthesis for the Double starvation and -Leu and -Ile single starvations shown in Fig. 3C. Also, in the current study, Leu starvation does lower charging of certain Leu tRNAs.

      Response: We thank the reviewer for raising this important point. In the revised manuscript, we have clarified this section and now offer a more refined interpretation of the tRNA charging patterns observed under double starvation. While Figure 3C shows a comparable reduction in global protein synthesis across the -Leu, -Ile, and double starvation conditions, it needs to be considered that the OPP assay has limited sensitivity. It operates in a relatively low fluorescence intensity range and is subject to background signal, which may obscure subtle differences between conditions. Moreover, other factors such as changes in protein stability or turnover could also contribute to the observed differences. Therefore, inter-condition differences in translation repression should be interpreted with caution. However, based on our stress response analysis (Figure 2), mTORC1 inactivation appears strongest under double starvation, likely leading to more profound suppression of translation initiation. This would reduce the overall demand for charged tRNAs and could explain why no detectable tRNA deacylation was observed under double starvation, even though mild uncharging of Leu isoacceptors occurred under -Leu, which exhibited a milder stress response. This distinction is consistent with the observed mild dwell time increases for one Leu codon under -Leu, but not in the double condition. Similarly, the absence of Ile codon stalling and tRNA deacylation under double starvation may be attributed to stress-driven reductions in elongation demand, preventing the tRNA depletion and codon-specific delays observed under single Ile starvation. A more direct clarification is now included in the revised manuscript.

      Reviewer #2 (Significance):

      The results here are significant in showing that starvation for a single amino acid does not lead to deacylation of all isoacceptors for that amino acid and in revealing that starvation for one amino acid can prevent deacylation of tRNAs for other amino acids, as shown most dramatically for the selective deacylation of only Val tRNAs in the triple BCAA starvation condition. For the various reasons indicated above, however, I'm not convinced that their "bottleneck" mechanism is adequate to explain this phenomenon, especially in the case of the selective deacylation of Ile vs Leu tRNA in the Double starvation regime. It's also significant that deacylation leads to ribosome build-up near the 5' ends of CDS, which seems to be associated with an enrichment for the hungry codons in the case of Val and Ile starvation, but inexplicably, not for Leu or the Double starvations. This last discrepancy makes it hard to understand how the -Leu and Double starvations produce RPF buildups near the 5' ends of CDSs. In addition, the claim in the Discussion that "our data also highlight the importance of the codon positional context within mRNAs, indicating that where a codon is located within the CDS can influence both the extent of ribosomal stalling and overall translation efficiency during nutrient stress" overstates the strength of evidence that the stalling events lead to substantial decreases in translational efficiencies for the affected mRNAs, as the stalling frequency and decreased protein output are significantly correlated only for the -Val starvation, and the data in Fig. 3 D-H suggest that the reductions in protein synthesis generally occur at the level of initiation, even for -Val starvation, with a contribution from slow elongation only for -Ile-which is in itself difficult to understand considering that stalling frequencies are highest in -Val. 
Thus, while many of the results are very intriguing and will be of considerable interest to the translation field, it is my opinion that a number of results have been overinterpreted and that important inconsistencies and complexities have been overlooked in concluding that a significant component of the translational inhibition arises from the increased decoding times at hungry codons during elongation and that the selective deacylation of Val tRNAs in the Triple starvation can be explained by the "bottleneck" mechanism. The complexities and limitations of the data and their interpretations should be discussed much more thoroughly in the Discussion, which currently is devoted mostly to other phenomena often of tangential importance to the current findings. A suitably revised manuscript would clearly state the limitations and caveats of the proposed mechanisms and consider other possible explanations as well.

      Again, we thank the reviewer for the valuable insights and constructive critiques. We believe that the concerns regarding potential overinterpretation and inconsistencies have now been addressed through clearer explanations and more cautious interpretation throughout the revised manuscript. We also agree that the original Discussion included aspects that, while interesting, were of secondary importance. In light of the reviewer’s suggestions, we have restructured and rebalanced the Discussion to focus more directly on the key findings and their implications. Importantly, we wish to clarify that we do not propose the elongation bottleneck model as a general mechanism across all conditions. In particular, for double (Leu/Ile) starvation, we attribute the observed effects primarily to stress response–mediated translational repression, and not to codon-specific stalling or tRNA depletion. We believe that this distinction is now more clearly conveyed in the revised manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary

      Worpenberg and colleagues investigated the translational consequences of branched-chain amino acid (BCAA) starvation in mouse cells. Limitation of individual BCAAs has been reported to cause codon-specific and global translational repression. In this paper, the authors use RNA-seq, ribosome profiling (Ribo-seq), proteomics, and tRNA charging assays to characterize the impacts of individual and combined depletion of leucine, isoleucine, and valine on translation. They find that BCAA starvation increases codon-specific ribosome dwell times, activates global translational stress responses and reduces global protein synthesis. They infer that this effect is due to decreased translation initiation and codon-specific translational stalling. They find that the effects of simultaneous depletion are non-additive. In valine and triple (valine, leucine, and isoleucine) depletion, they show that affected transcripts have a high density of valine codons early in their coding sequences, creating an "elongation bottleneck" that obscures the impact of starvation of other amino acids. Finally, they identify isoacceptor-specific differences in tRNA charging that help explain the codon-specific effects that they observe.

      We find the major findings convincing and clear. We find that some results are incompletely explained. We suggest an additional experiment and also have some minor comments that we hope will improve clarity and rigor.

      We thank the reviewer for the thorough and constructive feedback. We appreciate the recognition of our main findings and the helpful suggestions for improving the manuscript. Below we address each point in detail.

      Major comments

      Figure 3O: In this figure and the associated text, the authors try to determine whether differences in protein degradation can explain why some proteins have higher ribosome density but lower proteomic expression. However, since this analysis relies on published protein half-lives from non-starvation conditions and on the assumption that protein synthesis has entirely stopped, we are not convinced it is informative for this experimental context. It does not distinguish between a model in which protein synthesis has been reduced by stalling and a model in which both protein synthesis and degradation rate have increased, which are both consistent with their Ribo-seq and proteomic data. To address this issue, the authors should either perform protein half-life measurements under their starvation conditions, or more clearly explain these two models in the text and acknowledge that they cannot distinguish between them.

      Response: We agree with the reviewer that our current analysis, which is based on protein half-lives obtained under non-starvation conditions, cannot definitively separate the effects of reduced translation from those of increased protein degradation. We have revised the relevant section in the manuscript to more clearly state that this analysis is correlative in nature and serves only to explore one possible explanation for the observed disconnect between ribosome density and protein levels. We now also explicitly acknowledge that our dataset does not allow us to distinguish between a model in which protein output is reduced due to stalling and one in which both translation and degradation rates are altered. However, the observed log2FC in the proteomics data are often milder than expected based on complete-medium condition half-life alone, which would be difficult to reconcile with a dominant contribution from global protein destabilization. That said, we also acknowledge that protein degradation is highly context- and protein-specific, and that proteolytic regulation might still play a role. Performing a direct protein half-life measurement under our starvation conditions would indeed be required to rigorously test this, but such an experiment is outside the scope of this study. We now highlight this as a limitation and a valuable direction for future work, and we have softened any interpretations in the main text to reflect the uncertainty regarding the contribution of protein stability changes.

      Minor comments

      Figure 1G: Why does intracellular valine seem to be less depleted under starvation conditions than intracellular leucine or isoleucine? Are the limits of detections different for different amino acids? The authors should acknowledge this discrepancy and comment on whether it has any implications for interpretation of their results.

      Response: We thank the reviewer for this important point. While valine appears slightly less depleted than leucine or isoleucine in Figure 1G, the fold changes and absolute reductions are strong for all three BCAAs, including valine. To further illustrate this, we have added a supplementary bar chart showing the measured intracellular concentrations in µmol/L, including mean and variance across five biological replicates (Supplementary Figure 5A). We believe that the variation may reflect technical factors, such as differences in detection sensitivity or ionization efficiency between amino acids in the targeted metabolomics assay and, therefore, that the observed difference does not have a meaningful impact on the interpretation of our results. We now directly acknowledge these differences in the main text.

      Figure 1H: These data do not appear to meet the assumptions for linear regression. We suggest either reporting a Spearman R correlation (as the data appears linear in rank but not absolute value), or remove it entirely - we think the plot without statistics is sufficient.

      Response: We thank the reviewer for the suggestion. In the revised manuscript, we removed the statistical annotation and retained only the trend line to illustrate the general pattern. We agree that this visualization alone is sufficient to support the qualitative point we aimed to convey.

      Figure 2B: The in-text description of this figure states that "most" ISR genes show a "robust induction," but only three genes are shown in the figure, two of which are upregulated. The authors should instead specify that 2 out of the 3 genes profiled were robustly induced.

      Response: We have rephrased the sentence to say “two of the three genes profiled…” for precision and consistency with the data shown.

      Figure 2D: Please include the full, uncropped blots in the supplementary materials.

      Response: We have now added the full, uncropped western blots to the supplementary material (Supplementary Figure 8).

      Figure 2E: Swap the positions of the RPS6 and 4E-BP1 plots so they line up with their respective blots to make these figures easier to interpret. Authors should consider doing a one-way ANOVA and post-hoc analysis, if we correctly understand that they are making a conclusion about the difference between multiple groups in aggregate.

      Response: We thank the reviewer for the suggestion. The alignment of the RPS6 and 4E-BP1 plots with their respective blots has been corrected. As this panel focuses on comparisons to the control condition only, we have retained the original presentation.

      Figure 4B: Panel A in this figure is very convincing, and these plots don't add additional information. The authors could consider removing them. If this panel stays in, we suggest removing the "mid index" plot, since it is never referenced in the text and doesn't seem relevant to the message of the figure.

      Response: We appreciate the feedback. While we considered removing panel B as suggested, we decided to retain it because it provides a useful summary of panel A. To improve clarity and visual interpretation, we replaced the original boxplot with a bar plot displaying mean values and SEM error bars. We believe the bar plot now nicely illustrates that Val and Triple starvation lead to stronger effects, especially in the reduction of the 3′ index. The “mid index” plot, which was not referenced in the text and did not contribute to the central message, has been removed as suggested.

      Figure 4E: Why is there a reduction in frequency of a Leu and a Val codon under Ile starvation?

      Response: Thank you for highlighting this observation. The reduction in the frequency of a specific Leu and Val codon under Ile starvation in Figure 4F (former Figure 4E) is indeed intriguing. This figure reflects codon usage in the first 20% of the CDSs among the subset of transcripts that exhibit a footprint polarization under each starvation condition. As such, the observed depletion likely arises from the specific transcript composition of the polarized subset under -Ile, which differs from that under -Val or other conditions. Importantly, this pattern is not consistently observed when analyzing the full transcripts (another Leu codon is affected), indicating that it is not a systematic depletion of these codons. One possibility is that an increased frequency of Ile codons (AUC) within the constrained region may lead to a relative underrepresentation of other codons, such as Leu and Val. Alternatively, this may reflect non-random codon co-occurrence patterns within specific transcripts. While our current data do not allow us to investigate this further, we acknowledge these as speculative explanations and now mention this point in the Discussion as a potential avenue for future study.

      Figure 5G: There appears to be one Val codon early in the Hint1 transcript without much stalling under triple or valine starvation conditions. The authors should acknowledge this and comment on why this may be.

      Response: We thank the reviewer for pointing this out. While the Hint1 transcript indeed contains a valine codon early in its CDS, no clear stalling peak was observed at that position under valine or triple starvation. Several factors may contribute to this: local sequence context can influence ribosome pausing, and not all cognate codons necessarily lead to detectable stalling even under amino acid starvation. Additionally, coverage at the 5′ end of Hint1 is relatively sparse in our dataset, and potential mappability limitations, such as regions with low complexity or repetitive elements, may further reduce resolution at specific sites. We now briefly mention this in the manuscript to clarify the possible causes.

      Figure 5B: In the text referencing this figure, the authors state that "a high number of downregulated proteins with associated ribosome stalling sites did not show an overall decreased mean RPF count...as it would be expected from translation initiation defects, linking these stalling sites directly to proteomic changes." However, RPF is affected both by stalling (increases RPF) and initiation defects (decreases RPF). A gene with both stalling and decreased initiation may appear to have no RPF change. The data does suggest a contribution from stalling, but the authors should also acknowledge that reduced initiation may also be playing a role.

      Response: We agree with the reviewer's comment. Our cited statement should indeed be more nuanced. The reviewer correctly points out that RPFs are influenced by both increased ribosome density due to stalling and decreased ribosome density due to reduced initiation. Therefore, a gene experiencing both stalling and reduced initiation might appear to have no net change in RPF, or even a slight increase if stalling is dominant. Thus, while the presence of stalling sites strongly suggests a contribution from compromised elongation to reduced protein output, we cannot definitively rule out a concurrent role for reduced initiation, even in cases where RPF counts are not globally decreased. We revised this section in the manuscript to acknowledge this interplay.

      Figure 5E: the black text on dark brown in the center of the Venn diagram is difficult to read. The diagram should either have a different color scheme, or the text in the center should be white instead of black for higher contrast.

      Response: We have adjusted the text color for better contrast and improved readability.

      Supplementary Figure 1C: The ribosome dwell time data in this study is described as "highly correlated" with another published dwell time dataset, but the P and E site data do not seem strongly correlated. The authors should remove the word "highly."

      Response: We have removed the word “highly” to have a more cautious interpretation in the text.

      Supplementary Figure 3E: Not all of the highlighted codons in this figure are ones with prolonged dwell times. To clarify the point that dwell time change is not related to codon frequency, this figure should only highlight codons that have a significantly prolonged dwell time in at least one starvation condition.

      Response: We thank the reviewer for pointing this out. To improve clarity, we have revised the figure and now specifically highlight codons with significantly prolonged dwell times with stars.

      Supplementary Figure 5C: The gene Chop is mentioned in the main text when referencing this figure, but is absent from the heatmap.

      Response: We thank the reviewer for noting this. The gene Chop is annotated under its alternative name Ddit3 in the current version of the heatmap and is indeed present. To avoid confusion, we have now updated the label in the figure to display Chop (Ddit3) directly.

      Supplementary Figure 7A: The authors could clarify this figure by adding additional language to either the figure panel or the figure legend specifying that the RPM metric being used comes from Ribo-seq.

      Response: We have updated the legend to explicitly state that the RPM values shown are derived from Ribo-seq data.

      Supplementary Figure 7D: The metric used to describe the spatial relationship between the first valine and isoleucine codons in transcripts in this figure seems to be describing something conceptually similar to the stalling sites in Figure 5G, but uses a different metric. These figures would be easier to interpret if these spatial relationships were presented in a consistent way throughout the manuscript.

      Response: We thank the reviewer for this helpful observation. Supplementary Figure 7D (now Supplementary Figure 11B) originally used a gene-length-normalized metric to describe codon spacing, whereas Figure 5G depicted absolute nucleotide distances to stalling sites. To ensure consistency across the manuscript, we have now updated Supplementary Figure 11B to also use absolute distances. We believe this adjustment improves clarity and allows for a more direct comparison between spatial codon patterns and stalling events.

      Discussion:

      Reader understanding would be improved if the relevance of paragraphs were established in the first sentence. For instance, in the paragraphs about adaptive misacylation and posttranscriptional modifications, it is unclear until the end of the paragraph how these topics are relevant. Introducing the relevant aspects of the study (the fact that some starvation conditions have less severe effects and the observation about m6A-related mRNAs) at the beginning of these paragraphs would improve clarity.

      Response: We thank the reviewer for this helpful comment. We agree that the flow and clarity of the Discussion can be improved by making the relevance of each paragraph clearer from the outset. In the revised manuscript, we have restructured these sections to better highlight the connection between each topic and our main findings. These changes also align with suggestions from Reviewer 2, and we believe they help to focus the Discussion more tightly around the core insights of our study.

      The authors should provide more information and speculation about possible physiological relevance of their findings, particularly about the way that the effects of triple starvation are highly valine-dependent. Are there physiological conditions under which starvation of all three BCAAs is more likely than starvation of one or two of them? If so, are there any reasons why a valine-based bottleneck might be advantageous?

      Response: We appreciate the reviewer's insightful question regarding the physiological relevance of our findings, particularly the valine-dependent bottleneck observed under triple BCAA starvation. This prompts a crucial discussion on the broader biological context of our work.

      While complete starvation of all three BCAAs might be less frequent than individual deficiencies, such conditions are physiologically relevant in several contexts. In prolonged fasting, starvation, or severe cachectic states associated with chronic diseases (e.g., advanced cancer, critical illness), systemic amino acid pools, including BCAAs, can become significantly depleted due to increased catabolism and insufficient intake (Yu et al. 2021). Moreover, certain specialized diets or therapeutic strategies aim to modulate BCAA levels. For instance, in some Maple Syrup Urine Disease (MSUD) management protocols, BCAA intake is severely restricted to prevent the accumulation of toxic BCAA metabolites (Mann et al. 2021). Similarly, emerging cancer therapies sometimes explore nutrient deprivation strategies to selectively target tumor cells, which could involve broad BCAA reduction (e.g. Sheen et al. 2011; Xiao et al. 2016).

      In these contexts, a valine-based bottleneck, as we describe, could indeed represent an adaptive strategy. If valine-tRNAs are particularly susceptible to deacylation and valine codons are strategically enriched at the 5' end of transcripts, stalling at these early positions could serve as a rapid "gatekeeper" for global translation. This early-stage inhibition would conserve cellular energy and available amino acids by quickly reducing the overall demand for charged tRNAs. Such a mechanism could potentially prioritize the translation of a subset of proteins that might have different codon usage biases or are translated via alternative, less valine-dependent mechanisms. This aligns with the concept of a multi-layered translational control where global initiation repression (as reflected in mTORC1 inhibition and polysome profiles) is complemented by specific elongation checkpoints, allowing for a more nuanced and adaptive response to severe nutrient stress.

      Reviewer #3 (Significance):

      Nature and significance of the advance

      The main contribution of this work is to demonstrate that depletion of multiple amino acids simultaneously impacts translation elongation in ways that are not necessarily additive. These impacts can depend on the distribution of codons in a transcript. It adds to a growing body of work showing that essential amino acid starvation can cause codon-specific ribosome stalling. The authors suggest that the position-dependent stalling they observe could be a novel regulatory mechanism to alleviate the effects of multi-amino acid starvation. However, it is not fully clear from the paper what the significance of a valine-based regulatory adaptation to BCAA starvation is, or whether simultaneous starvation of all three BCAAs is of particular physiological relevance. The paper's primary contribution is mainly focused on the similarity between valine and triple BCAA starvation, and it provides limited insight into the effects of combined depletion of two BCAAs.

      Context of existing literature

      Although ribosome profiling does not distinguish between actively-elongating and stalled ribosomes, sites with higher read coverage, and thereby higher inferred dwell time, can be used to infer ribosome stalling (Ingolia 2011). Various downstream effects of essential amino acid depletion have been documented, such as leucine deficiency being sensed by mTORC1 via leucyl-tRNA synthetase (Dittmar 2005, Han 2012), and shared transcriptional responses among many amino acid depletion conditions (Tang 2015). These authors have previously measured the translational effects of nutrient stress using ribosome profiling (e.g., Gobet 2020), as have others (Darnell 2018, Kochavi et al. 2024). The present work represents the first study (to our knowledge) combining BCAA depletions, representing an incremental and useful contribution to our understanding of translational responses to stress conditions.

      Audience

      This work is of interest to investigators studying the response of human cells in stress conditions, such as in human disease, as well as investigators studying the basic biology of eukaryotic translational control.

      Reviewer expertise: mRNA decay and translation regulation in bacteria.

      We hope the authors have found our comments thoughtful and useful. We welcome further discussion or clarification via email: Juliana Stanley (julianst@mit.edu) and Hannah LeBlanc (leblanch@mit.edu).

      We sincerely thank the reviewers for their thoughtful and constructive feedback, as well as for their careful and thorough reading of our manuscript. We also gratefully acknowledge the invitation for further discussion and would be happy to engage in future correspondence.

      References

      Allen, George E., Olesya O. Panasenko, Zoltan Villanyi, Marina Zagatti, Benjamin Weiss, Lucile Pagliazzo, Susanne Huch, et al. 2021. “Not4 and Not5 Modulate Translation Elongation by Rps7A Ubiquitination, Rli1 Moonlighting, and Condensates That Exclude eIF5A.” Cell Reports 36 (9): 109633. https://doi.org/10.1016/j.celrep.2021.109633.

      Darnell, Alicia M., Arvind R. Subramaniam, and Erin K. O’Shea. 2018. “Translational Control through Differential Ribosome Pausing during Amino Acid Limitation in Mammalian Cells.” Molecular Cell 71 (2): 229-243.e11. https://doi.org/10.1016/j.molcel.2018.06.041.

      Dittmar, Kimberly A., Michael A. Sørensen, Johan Elf, Måns Ehrenberg, and Tao Pan. 2005. “Selective Charging of tRNA Isoacceptors Induced by Amino-Acid Starvation.” EMBO Reports 6 (2): 151–57. https://doi.org/10.1038/sj.embor.7400341.

      Elf, Johan, Daniel Nilsson, Tanel Tenson, and Mans Ehrenberg. 2003. “Selective Charging of tRNA Isoacceptors Explains Patterns of Codon Usage.” Science (New York, N.Y.) 300 (5626): 1718–22. https://doi.org/10.1126/science.1083811.

      Gobet, Cédric, Benjamin Dieter Weger, Julien Marquis, Eva Martin, Nagammal Neelagandan, Frédéric Gachon, and Felix Naef. 2020. “Robust Landscapes of Ribosome Dwell Times and Aminoacyl-tRNAs in Response to Nutrient Stress in Liver.” Proceedings of the National Academy of Sciences of the United States of America 117 (17): 9630–41. https://doi.org/10.1073/pnas.1918145117.

      Hussmann, Jeffrey A., Stephanie Patchett, Arlen Johnson, Sara Sawyer, and William H. Press. 2015. “Understanding Biases in Ribosome Profiling Experiments Reveals Signatures of Translation Dynamics in Yeast.” Edited by Michael Snyder. PLOS Genetics 11 (12): e1005732. https://doi.org/10.1371/journal.pgen.1005732.

      Hwang, Jae-Yeon, and Allen R. Buskirk. 2017. “A Ribosome Profiling Study of mRNA Cleavage by the Endonuclease RelE.” Nucleic Acids Research 45 (1): 327–36. https://doi.org/10.1093/nar/gkw944.

      Juszkiewicz, Szymon, and Ramanujan S. Hegde. 2017. “Initiation of Quality Control during Poly(A) Translation Requires Site-Specific Ribosome Ubiquitination.” Molecular Cell 65 (4): 743-750.e4. https://doi.org/10.1016/j.molcel.2016.11.039.

      Li, Fajin, Jianhuo Fang, Yifan Yu, Sijia Hao, Qin Zou, Qinglin Zeng, and Xuerui Yang. 2023. “Reanalysis of Ribosome Profiling Datasets Reveals a Function of Rocaglamide A in Perturbing the Dynamics of Translation Elongation via eIF4A.” Nature Communications 14 (1): 553. https://doi.org/10.1038/s41467-023-36290-w.

      Mann, Gagandeep, Stephen Mora, Glory Madu, and Olasunkanmi A. J. Adegoke. 2021. “Branched-Chain Amino Acids: Catabolism in Skeletal Muscle and Implications for Muscle and Whole-Body Metabolism.” Frontiers in Physiology 12 (July):702826. https://doi.org/10.3389/fphys.2021.702826.

      Saikia, Mridusmita, Xiaoyun Wang, Yuanhui Mao, Ji Wan, Tao Pan, and Shu-Bing Qian. 2016. “Codon Optimality Controls Differential mRNA Translation during Amino Acid Starvation.” RNA (New York, N.Y.) 22 (11): 1719–27. https://doi.org/10.1261/rna.058180.116.

      Sharma, Puneet, Jie Wu, Benedikt S. Nilges, and Sebastian A. Leidel. 2021. “Humans and Other Commonly Used Model Organisms Are Resistant to Cycloheximide-Mediated Biases in Ribosome Profiling Experiments.” Nature Communications 12 (1): 5094. https://doi.org/10.1038/s41467-021-25411-y.

      Sheen, Joon-Ho, Roberto Zoncu, Dohoon Kim, and David M. Sabatini. 2011. “Defective Regulation of Autophagy upon Leucine Deprivation Reveals a Targetable Liability of Human Melanoma Cells In Vitro and In Vivo.” Cancer Cell 19 (5): 613–28. https://doi.org/10.1016/j.ccr.2011.03.012.

      Xiao, Fei, Chunxia Wang, Hongkun Yin, Junjie Yu, Shanghai Chen, Jing Fang, and Feifan Guo. 2016. “Leucine Deprivation Inhibits Proliferation and Induces Apoptosis of Human Breast Cancer Cells via Fatty Acid Synthase.” Oncotarget 7 (39): 63679–89. https://doi.org/10.18632/oncotarget.11626.

      Yu, Deyang, Nicole E. Richardson, Cara L. Green, Alexandra B. Spicer, Michaela E. Murphy, Victoria Flores, Cholsoon Jang, et al. 2021. “The Adverse Metabolic Effects of Branched-Chain Amino Acids Are Mediated by Isoleucine and Valine.” Cell Metabolism 33 (5): 905-922.e6. https://doi.org/10.1016/j.cmet.2021.03.025.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript describes the translational responses to single and combined BCAA shortages in mouse cell lines. Using Ribo-seq and RNA-seq analysis, the authors found selective ribosome pausing at codons that encode the depleted amino acids: pausing at valine codons was prominent under both single and triple starvation, whereas isoleucine codons showed pausing only under a single depletion. They analyzed the mechanisms of the unexpected selective pausing and proposed that the positional codon usage bias could shape the ribosome stalling and tRNA charging patterns across different amino acids. They also examined the stress responses and the changes in the protein expression levels under BCAA starvation.

      The manuscript was well-written, and the findings are interesting, especially their model that positional codon usage bias could be a regulator of ribosome pausing and tRNA charging levels. Although different translational responses to distinct amino acid starvation have been widely documented, the positional codon usage bias is an interesting aspect. The manuscript's central message could have been made clearer. The authors may consider emphasizing this point more explicitly in the abstract. The rich multi-omics dataset in this work provides valuable resources for the translation field.

      Major comments

      1. The abstract may need to be revised since it is hard to immediately catch the authors' main point. If the authors regard this work as a resource paper, the current version is fine. But it could be better to point out the positional codon usages the authors found, which is a strong point of the current manuscript.
      2. Page 18 "Beyond these tRNA dynamics, our data also highlight the importance of the codon positional context within mRNAs, indicating that where a codon is located within the CDS can influence both the extent of ribosomal stalling and overall translation efficiency during nutrient stress." This idea is interesting. To what extent do the authors think this could be generalized? The authors may discuss whether they think their proposed model is specific to the different ribosome stalling patterns between valine and isoleucine codons or generalizes to other codon combinations. For example, the positional codon usage bias will differ among organisms; are there any previous reports on ribosome behaviors that align with their model? Even if the authors think this model can be applied to BCAA starvation, would it be possible to explain the different isoleucine codon responses between single and double starvation? The authors may discuss why the ribosome stalling at isoleucine AUU and AUC codons was slightly attenuated under double starvation. And how about the different leucine codon responses among single, double, and triple starvations, although the pausing is not as strong as at isoleucine and valine codons? Experimental validation using artificial reporters carrying biased sequences may also be considered.
      3. Page 13 "Moreover, we noticed that DT changes extend beyond the ribosomal A-site, including the P-site, E-site, and even further positions (Supplementary Fig. 2A), consistent with other studies on single amino acid starvation 39 (Supplementary Fig. 2B-C)." Could the widespread DT changes be due to Ribo-DT pipeline they used or difficulties in offset determination? Indeed the authors showed that this feature was found in other datasets, but it seems that the datasets were processed and analyzed in the same way as their data. The original Ribo-DT paper (Gobet and Naef, 2022, Methods) also showed some widespread DT changes even from RNA-seq. Another analysis method like the codon subsequence abundant shift as a part of diricore analysis (Loayza-Puch et al., 2016, Nature) did not show that broad changed regions. The authors are encouraged to re-analyze the data sets using different methods.
      4. Page 13 "Intriguingly, only two of the three isoleucine codons (AUU and AUC) showed increased DTs upon Ile starvation (p < 0.01), while just one leucine codon (CUU) exhibited a modest but significant DT increase (p < 0.01) under Leu starvation (Figure 1A-B, Supplementary Figure 2A)." How can the authors explain the different strengths of ribosome pausing at Ile codons under Ile and double starvation? The AUA codon did not show any pausing under either of the starvation conditions. Throughout the manuscript, the authors mainly describe the difference between amino acids but it is desirable to discuss the codon-level difference as well.
      5. Page 13 "We examined the effects of single amino acid starvations (-Leu, -Ile and -Val), as well as combinations, including a double starvation of leucine and isoleucine (hereafter referred to as "double") and a starvation of leucine, isoleucine, and valine ("triple"), allowing us to identify potential non-additive effects." The other double starvations, isoleucine and valine, and leucine and valine, would further support their hypothesis on the effects of the positional codon usage bias on ribosome pausing and tRNA charging patterns. Although this could be beyond the scope of the current manuscript, the authors are encouraged to provide a rationale for the chosen combination.

      Minor comments

      Page 16 "these results imply that BCAA deprivation lowers protein output through multiple pathways: a combination of reduced initiation, direct elongation blocks (stalling), and possibly an increased proteolysis" This conclusion is totally right but may be too general. Could the authors summarize BCAA-specific features of the events including reduced initiation, stalling, and proteolysis that all contribute to protein outputs? This is not well discussed in the latter sections including Discussion.

      Significance

      The manuscript was well-written, and the findings are interesting, especially their model that positional codon usage bias could be a regulator of ribosome pausing and tRNA charging levels. Although different translational responses to distinct amino acid starvation have been widely documented, the positional codon usage bias is an interesting aspect. The manuscript's central message could have been made clearer. The authors may consider emphasizing this point more explicitly in the abstract. The rich multi-omics dataset in this work provides valuable resources for the translation field.

    1. row_index = list(range(rows))[0:rows-1:2] col_index = list(range(cols))[0:cols-1:2]

      range(rows) generates a sequence of integers from 0 to rows-1. Example: range(6) -> 0, 1, 2, 3, 4, 5. list(range(rows)) converts the range object into an actual list: list(range(6)) → [0, 1, 2, 3, 4, 5]
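      The annotated expressions can be checked with a minimal sketch. The values rows = 6 and cols = 5 are assumptions chosen for illustration and do not come from the annotated source. The slice [0:n-1:2] starts at index 0, stops before index n-1, and takes every second element, so the last index is always left out.

```python
# Illustrative sizes only (assumed, not taken from the annotated source).
rows, cols = 6, 5

# list(range(rows)) is [0, 1, 2, 3, 4, 5]; the slice [0:rows-1:2]
# keeps indices 0, 2, 4 and stops before index rows-1 = 5.
row_index = list(range(rows))[0:rows-1:2]

# list(range(cols)) is [0, 1, 2, 3, 4]; the slice [0:cols-1:2]
# stops before index cols-1 = 4, so 4 is excluded even though it is even.
col_index = list(range(cols))[0:cols-1:2]

print(row_index)  # [0, 2, 4]
print(col_index)  # [0, 2]
```

      Note that with an odd length the final even index is dropped, because the stop value n-1 is exclusive.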

    1. Reviewer #2 (Public review):

      In this study, Xiong et al. investigate whether rhythmic sampling - a process typically observed in the attended processing of visual stimuli - extends to task-irrelevant distractors. By using EEG with frequency tagging and multivariate pattern analysis (MVPA), they aimed to characterize the temporal dynamics of both target and distractor processing and examine whether these processes oscillate in time. The central hypothesis is that target and distractor processing occur rhythmically, and the phase relationship between these rhythms correlates with behavioral performance.

      Major Strengths
      (1) The extension of rhythmic attentional sampling to include distractors is a novel and interesting question.
      (2) The decoding of emotional distractor content using MVPA from SSVEP signals is an elegant solution to the problem of assessing distractor engagement in the absence of direct behavioral measures.
      (3) The finding that relative phase (between 1 Hz target and distractor processes) predicts behavioral performance is compelling.

      Major Weaknesses and Limitations
      (1) The central claim of 1 Hz rhythmic sampling is insufficiently validated. The windowing procedure (0.5 s windows with 0.25 s step) inherently restricts frequency resolution, potentially biasing toward low-frequency components like 1 Hz. Testing different window durations or providing controls would significantly strengthen this claim.
      (2) The study lacks a baseline or control condition without distractors. This makes it difficult to determine whether the distractor-related decoding signals or the 1 Hz effect reflect genuine distractor processing or more general task dynamics.
      (3) The pairwise decoding accuracies for distractor categories hover close to chance (~55%), raising concerns about robustness. While statistically above chance, the small effect sizes need careful interpretation, particularly when linked to behavior.
      (4) Neither target nor distractor signal strength (SSVEP amplitude) correlates with behavioral accuracy. The study instead relies heavily on relative phase, which, while interesting, may benefit from additional converging evidence.
      (5) Phase analysis is performed between different types of signals (time-resolved SSVEP amplitude and time-resolved decoding accuracy), which hinders interpretability.
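      The arithmetic behind point (1) can be sketched as follows. This is only an illustration under two assumptions that the manuscript does not confirm here: that the time-resolved series has one value per 0.25 s step, and that each 0.5 s window behaves roughly like a boxcar (moving-average) filter. The function name boxcar_gain is hypothetical.

```python
import math

window_s = 0.5   # window length quoted in the review
step_s = 0.25    # step size quoted in the review

# One value per step gives the sampling rate of the windowed series.
fs = 1.0 / step_s        # Hz
nyquist = fs / 2.0       # highest rhythm the series can represent, in Hz


def boxcar_gain(freq_hz, width_s):
    """Magnitude response |sin(x)/x| of an ideal boxcar average, x = pi*f*width."""
    x = math.pi * freq_hz * width_s
    return 1.0 if x == 0 else abs(math.sin(x) / x)


print("sampling rate (Hz):", fs)
print("Nyquist limit (Hz):", nyquist)
for f in (0.5, 1.0, 1.5, 2.0):
    # Relative attenuation of a rhythm at frequency f after windowing.
    print(f, "Hz gain:", round(boxcar_gain(f, window_s), 3))
```

      Under these assumptions, rhythms above 2 Hz cannot be represented at all and the gain falls steadily toward 2 Hz, which is why varying the window duration is a reasonable control.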

      The authors largely achieved their stated goal of assessing rhythmic sampling of distractors. However, the conclusions drawn - particularly regarding the presence of 1 Hz rhythmicity - rest on analytical choices that should be scrutinized further. While the observed phase-performance relationship is interesting and potentially impactful, the lack of stronger and convergent evidence on the frequency component itself reduces confidence in the broader conclusions.

      If validated, the findings will advance our understanding of attentional dynamics and competition in complex visual environments. Demonstrating that ignored distractors can be rhythmically sampled at similar frequencies to targets has implications for models of attention and cognitive control. However, the methodological limitations currently constrain the paper's impact.

      Additional Considerations
      • The use of EEG-fMRI is mentioned but not leveraged. If BOLD data were collected, even exploratory fMRI analyses (e.g., distractor modulation in visual cortex) could provide valuable converging evidence.
      • In turn, removal of fMRI artifacts might introduce biases or alter the data. For instance, the authors might consider investigating potential fMRI artifact harmonics around 1 Hz to address concerns regarding induced spectral components.

      Comments on revisions:

      The authors have addressed my previous points, and the manuscript is substantially improved. The key methodological clarifications have been incorporated, and the interpretation of findings has been appropriately moderated. I have no further major concerns.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Prior to the point-by-point response to the reviewer, we would like to sincerely thank all the peer reviewers for their overwhelmingly positive comments and helpful suggestions. The recommendations have undoubtedly improved our initial submission, and we have done our best to incorporate as many of the suggestions as possible.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Jones et al. have submitted a manuscript detailing the role of Coenzyme A in the regulation of macrophage polarization. Overall, the manuscript is well designed, and the conclusions are well supported by the data. I find no major or minor deficiencies that need to be corrected.

      Reviewer #1 (Significance (Required)):

      For decades the immunology community has boldly stated that mitochondrial metabolism not only provides the bioenergetics for cell expansion but also dictates cell fate. This has been especially true for fatty acid beta oxidation. Macrophage, T-cell and B-cell polarization have all been shown to require FAO for their polarization, but all based on one inhibitor. NONE of these observations hold up with more rigorous experimentation. The Divakaruni group has previously suggested that intracellular CoA homeostasis was the driver of macrophage differentiation as they could reverse the inhibitory effects by providing heroic levels of CoA extracellularly. Here, they have clarified the role of CoA. Intracellular CoA does not affect macrophage polarization/differentiation. This was done with clever manipulation of the CoA pools. Rather, extracellular CoA can act as a weak TLR4 ligand. This work nicely clarifies their previous work and further demonstrates a role for this metabolite as an endogenous activator of type 1 macrophages.

      We are thrilled by the positive comments about our work, and we are grateful the reviewer found our submission to be clarifying for the field and significant in the larger context of immunometabolism research.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This is a fairly straightforward manuscript that indicated CoA acts as a "weak" TLR4 agonist and primes macrophages for alternative activation. Overall, the experiments are well done and clear enough. There are two major issues that need to be addressed:

      We thank the reviewer for their positive comments regarding the quality and clarity of our work.

      1. Previous work has shown the following pathway: LPS>IL10>STAT3>IL4Ra>>>increased responsiveness to IL4/IL13 and increased expression of M2 associated markers (please note, this pathway does not apply to Arg1, often erroneously associated with M2 macrophages - LPS induces Arg1 far more than IL4 and this is independent of the STAT6 pathway - Dichtl et al., Science Advances and El Kasmi et al. Nature Immunology, and others). This pathway was first described in Lang et al. 2002 J. Immunol. Subsequently, other groups showed IL6 (Jens Brüning) and OSM (Carl Richards) do the same thing, which is not surprising given that they are STAT3 activators. Thus, Il4ra is a STAT3 target gene; this also makes sense in the kinetic evolution of macrophages from inflammatory to tissue reparative (if they survive). In my view, the authors have most likely found the same pathway. In Jones, expression of the IL4Ra was not quantified. Thus, the pathway described above needs to be accounted for. It may not apply here but seems the easiest explanation of the data.

      This is an excellent and important experiment suggested by the reviewer, and we address this in our revised Supplemental Figure 5. To determine whether the effect of CoA can be explained simply by a STAT3-mediated effect on the IL-4 receptor, we treated cells with the well-characterized STAT3 inhibitor Napabucasin and measured whether CoA could enhance the macrophage IL-4 response. Two results are clear from the data:

      • Treatment with Napabucasin reduced the expression of IL-4-linked cell surface markers and the IL-4 target gene Ccl8. This serves as an important control consistent with the Il4ra gene being a STAT3 target that increases IL-4 responsiveness.
      • Despite STAT3 inhibition and a reduced IL-4 response, CoA provision still augmented the IL-4-induced expression of Ccl8 and the percentage of CD206+/CD301+ cells, indicating a STAT3-independent mechanism. The result aligns with our ATAC-Seq data in Figure 6 that shows broad changes in chromatin accessibility that cannot be completely explained by expression-level changes in the IL-4 receptor.

      Can the authors come up with a meaningful in vivo experiment to corroborate their data? Pantothenate-deficient mice have many phenotypes (not fully explored at all - PMID 31918006, for example) and pantothenate metabolism can be manipulated in different ways. Obviously, a complex in vivo experiment is not feasible here. But this should be discussed. What happens in human macrophages, where "polarization" is a completely different beast?

      We thank the reviewer for these thoughtful comments, and address the questions regarding in vivo proof-of-concept and polarization of human macrophages separately:

      • Regarding the question of whether CoA can enhance the phenotype of IL-4-activated human macrophages, this is an excellent suggestion and we have added the data as Figure 1h. Indeed, Coenzyme A dramatically amplifies expression of the human IL-4 responsive genes CCL17, TGM2, and PDCD1LG2 (similarly to mouse macrophages). The result substantially expands the significance of our work by showing the phenotype is reproducible in both mouse and human macrophages – unlike many immunometabolic phenotypes – and we thank the reviewer again for suggesting this experiment.
      • With respect to an in vivo experiment to corroborate our data, we entirely agree with the reviewer regarding both the importance, but also the difficulty in interpretation, of an experiment genetically manipulating CoA synthesis in vivo. As they have suggested, we raise these issues in the discussion on Lines 370-377 of the revised manuscript. Here, we note the following points:
      • Wherever possible/appropriate (e.g. Figures 1g, 3f&g, 5g&h), we have sought to corroborate our in vitro findings with in vivo/ex vivo proofs-of-concept.
      • Studying immune phenotypes in pantothenate-deficient mice would be an exciting experiment in principle, but difficult to interpret if conducted. As noted by the reviewer in the work from Drs. Rock and Jackowski, knockout of one of four isoforms of pantothenate kinase (PANK) shows mild phenotypes consistent with compensation across isoforms for CoA provision. Global double knockout of PANK1 and PANK2, however, is postnatally lethal. Regardless, a tissue-specific double knockout in myeloid cells is unlikely to show a phenotype given our results showing that manipulating intracellular CoA levels in BMDMs does not alter the IL-4 response (Figs. 2h-j).
      • Given the established role of CoA in postnatal development, it would be difficult to attribute any immunologic phenotypes in genetically modified mice to direct effects of CoA as a metabolic DAMP as opposed to indirect effects from a chronically altered immune system.

      Reviewer #2 (Significance (Required)): This is a fairly straightforward manuscript that indicated CoA acts as a "weak" TLR4 agonist and primes macrophages for alternative activation. Overall, the experiments are well done and clear enough.

      We reiterate our gratitude for the comments on the quality and clarity of our work.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this manuscript on enhancement of mIL-4 polarization by exogenous CoA, the authors follow up on their previous studies that had shown a correlation between Etomoxir-driven block in mIL-4 and a reduction of intracellular CoA levels. The results obtained (lack of enhancement of IL-4-induced changes in oxidative phosphorylation and glycolysis; lack of impact of pharmacological decrease/increase of intracellular CoA levels) led them to discard their initial hypothesis. Instead, the presence of a proinflammatory gene signature in macrophages treated with IL-4+CoA triggered experiments testing the involvement of TLR-Myd88 signaling and the identification of CoA as a weak agonist for TLR4 (which is consistent with a preprint manuscript posted in 2022 by others and showing induction of proinflammatory gene expression in a TLR2/4-dependent manner).

      • Significance: Overall, these results are novel and interesting, although the use of yeast-derived CoA preparations raises a question about the contribution of contaminants that is only partially controlled by data obtained with a synthetic CoA. Regarding a biological role for CoA in macrophage biology in vivo, the authors propose that CoA may act as a DAMP upon release from dying/dead cells and thereby modify transcriptional polarization of m(IL-4). I have several comments related to specific experimental conditions and interpretation that should be addressed. Most importantly, the key findings of the manuscript should be demonstrated using synthetic CoA as described in comment #5.

      We are heartened that the reviewer found our initial submission to be novel and interesting, and are grateful for their suggestions to reinforce our existing data with more studies comparing yeast-derived and synthetically-derived Coenzyme A. We have done our best to address each of the individual questions below:

      Major comments:

      1. Increasing/decreasing intracellular CoA levels does not alter IL-4-induced CD206 expression (Fig. 2i/j). However, the impact of CoA addition to mIL-4 is stronger for Ccl8 and Mgl2 mRNA (Fig. 1a) than for the CD206+ cell fraction (Fig. 1d). Therefore, it would be better (higher sensitivity) to include expression of these genes as readout after CPCA/PZ-2891 treatment.

      This is a helpful suggestion, and we have now conducted gene expression studies to complement our flow cytometry and mass spectrometry studies while manipulating the intracellular CoA pool. In line with our previous work, neither CPCA (which decreases intracellular free CoA) nor PZ-2891 (which increases intracellular free CoA) meaningfully alters expression of IL-4-linked genes including Ccl8 or Mgl2. In fact, the only (statistically insignificant) trend refutes the hypothesis, as treatment with CPCA leads to marginally increased gene expression. These results are now included in Supplemental Figure S2f. We thank the reviewer for this helpful suggestion, as it has strengthened our conclusion that intracellular CoA levels do not adjust the macrophage IL-4 response.

      • The CoA-induced proinflammatory gene expression in Fig. 3c is relatively weak (e.g. compared to LPS). The authors use CoA throughout the manuscript at a concentration of 1 mM, and we do not know how much of it is required to cause an effect at all. Therefore, dose-response curves for the stimulation of macrophages with titrated amounts of CoA should be provided. In addition,

      We thank the reviewer for bringing up this point so we could clarify and add to our existing data. We should note that Supplemental Figures 1b&c of our previous submission (and resubmitted manuscript) detail a concentration-response curve showing that as little as 62.5 µM CoA – the lowest concentration tested – was sufficient to enhance IL-4 cell surface marker expression.

      However, as the reviewer notes, it is an excellent suggestion to conduct a similar concentration-response experiment to determine whether this lines up with CoA inducing a pro-inflammatory response. The full data set is presented in the answer to reviewer question 4 (comparing CoA purchased from Sigma vs. Avanti Polar Lipids), though we now show in Supplemental Figure S3 that 62.5 µM CoA is sufficient to elicit a pro-inflammatory response. Though it is indeed a weak effect as noted by the reviewer, our data suggest that the relatively mild stimulus is crucial for the effect. Given the results with the TLR3 agonist Poly I:C (Figure 5), which engages a Type 1 interferon response, strong TLR4 agonists that engage the TRIF/Type I interferon arm of the TLR4 response are likely to blunt or block the IL-4 response.

      • Related question: we are informed that the concentration of CoA in the mitochondrial matrix is 5mM, whereas cytosol contains 100µM. For CoA to act as DAMP, I would like to know the concentration of it in supernatants of cell cultures (live vs. dying/dead cells) and from tissues.

      This is an important point brought up by the reviewer, and we agree that the implicit issue raised (i.e. “do the concentrations of CoA required to see an effect reconcile with a physiological role as a DAMP?”) should be more thoroughly addressed in the manuscript. Tissue concentrations of free CoA (in ng/mg tissue) are well established for mice and range from >100 nmol/g tissue (liver, heart, brown adipose tissue) to Nonetheless, the reviewer’s larger point is very well reasoned, and we address it in the following ways in the discussion on Lines 378-391.

      • In light of the reviewer’s comment, we now mention specific instances in the discussion where CoA acting as a DAMP may reasonably play a physiological role (e.g. acetaminophen-induced acute liver injury or other forms of sterile liver injury given that DAMPs are known to be important factors and liver tissue contains relatively high concentrations of CoA).
      • Although cytoplasmic concentrations of CoA may only be 50-100 µM, our work establishes a framework for how ubiquitous metabolic co-factors can activate pattern recognition receptors. Put another way, although CoA itself may not be a physiologically relevant DAMP, discovering this pathway could inform how other nucleotide or nucleoside analogs (e.g. adenine- or adenosine-containing molecules present at millimolar concentrations) exert their effects on innate immunity.
      • Our newly obtained data with HMDMs (Figure 1h) show that the CoA response in human macrophages – boosting IL-4-linked gene expression by 10-100X – may be much stronger than the 1.5-5X effect observed in mouse BMDMs. As such, it is exciting to speculate that CoA may have a more potent effect on the IL-4 response in humans relative to mice. We trust the reviewer understands the limitations of obtaining human macrophages that preclude conducting a thorough concentration-response analysis given the restrictions of a manuscript revision.
      • It is very good that the authors validate the findings obtained using the yeast-derived CoA with the synthetic molecule. It is very conceivable that the 15% contaminating substances in the yeast CoA could be causing the observed changes in m(IL-4). The fact that synthetic CoA has higher activity in proinflammatory gene expression by BMM (Suppl. Fig. S3) is reassuring, however, it raises the question why this is the case. One possibility is that the concentrations of the different CoA preps cannot directly be compared. Therefore, dose response curves should also be provided for synthetic CoA.

      This is an astute observation by the reviewer and we thank them for reading our manuscript with such detailed attention to pick this up. We are reassured that the reviewer shares our interpretation that the effect of CoA is not due to a contaminating TLR4 agonist in the yeast-derived preparation (from Sigma-Aldrich; ~85% pure) given a negative Limulus Test (Supplemental Figure S4b). Moreover, the synthetically-derived preparation (from Avanti Polar Lipids; ~99% pure) yields a stronger TLR4 response.

      An exploration of the follow-on question regarding why the difference in effect is greater than 15% is presented below. These experiments have been added to Supplemental Figure S4c&d. In summary, our data suggest the individual concentrations indeed cannot be directly compared: at matched concentrations, synthetic Avanti CoA has an effect more than 15% greater than yeast-derived Sigma CoA. There are likely multiple factors that could explain this, some of which are listed below.

      • The physiological effect of a TLR agonist need not be linear with its concentration, as demonstrated by the sigmoidal calibration curves for the TLR-expressing HEK-blue cells (Figures 4b, S4a). This likely does not explain the dramatic difference between the two CoA preparations but is worth noting.
      • While we have determined that the 15% contaminating substances in the yeast-derived CoA are not causing the observed changes in the IL-4 response, it is formally possible that there are contaminating substances blunting the pro-inflammatory response and therefore limiting the effect of CoA purchased from Sigma-Aldrich relative to that from Avanti Polar Lipids. Importantly, however, our data in response to Reviewer Question #5 show there is no difference in amplifying the IL-4 response between the yeast- and synthetically-derived CoA.
      • The difference in activity of yeast and synth. CoA could also be caused by the additional biologically active molecules in the yeast CoA. Therefore, it is important to show that the key findings in the paper (enhancement of m(IL-4) associated gene expression and CD206+ upregulation in vitro and in vivo) are also induced by synth. CoA. This is even more important in the context of the Myd88-independence of CD206+ upregulation in BMM treated with CoA (Suppl. Fig. S4). The experiment should be repeated with synth. CoA. If the enhancement of CD206+ cells induced by CoA is indeed unchanged in Myd88 KO BMM, then the title of the manuscript "CoA enhances alternative macrophage activation via Myd88" would not be supported by the data and would need to be changed. Activation of the TLR4 reporter cell line should also be tested using the synth. CoA molecule.
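
      The authors' point above that a TLR agonist's effect need not scale linearly with its concentration can be illustrated with a generic Hill (sigmoidal) curve. This is only a sketch: the EC50, Hill coefficient, and nominal concentration below are hypothetical values chosen for illustration, not parameters from the manuscript; only the approximate purities (~85% and ~99%) come from the response.

```python
def hill_response(conc, ec50, hill_n, top=1.0):
    """Sigmoidal (Hill) concentration-response curve."""
    return top * conc**hill_n / (ec50**hill_n + conc**hill_n)


# Hypothetical parameters, chosen only for illustration.
EC50 = 1.0          # concentration giving a half-maximal response
HILL_N = 3.0        # steepness of the curve
NOMINAL_CONC = 1.0  # same nominal concentration for both preparations


def response_ratio(purity_a, purity_b, nominal=NOMINAL_CONC,
                   ec50=EC50, hill_n=HILL_N):
    """Ratio of responses when only the active fraction of each prep counts."""
    resp_a = hill_response(purity_a * nominal, ec50, hill_n)
    resp_b = hill_response(purity_b * nominal, ec50, hill_n)
    return resp_a / resp_b


# ~99% pure synthetic prep versus ~85% pure yeast-derived prep.
print(f"concentration ratio: {0.99 / 0.85:.3f}")
print(f"response ratio:      {response_ratio(0.99, 0.85):.3f}")
```

      With these illustrative parameters the response ratio exceeds the concentration ratio, so a purity gap of roughly 15% can produce a larger gap in effect. As the authors note, this kind of non-linearity alone is unlikely to account for a dramatic difference between preparations.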

      We are grateful for this suggestion by the reviewer to further cement the idea that our observation of CoA enhancing the macrophage IL-4 response was not due to a contaminant in the Sigma-Aldrich CoA preparation. The reviewer makes a few points in this question which we address individually here.

      • The suggestion to confirm that the CoA-induced enhancement of M(IL-4) is not due to a contaminating substance in the Sigma-Aldrich CoA is excellent and necessary. Here we show that synthetically derived CoA (99% pure, purchased from Avanti Polar Lipids) quantitatively reproduces the effect from yeast-derived CoA from Sigma-Aldrich in Supplemental Figure S4e. The response is noteworthy because synthetic CoA elicits a profoundly stronger pro-inflammatory response than yeast-derived CoA, yet both have a similar effect on augmenting M(IL-4). This suggests that any appropriate pro-inflammatory response – irrespective of the relative strength or weakness – is sufficient to maximize the effect. This can also be observed with the range of MyD88-linked TLR agonists used in Figures 5 and S6a.
      • Similarly, we also conducted experiments to show that the effect of synthetic CoA on M(IL-4) is independent of MyD88 similarly to yeast-derived CoA. These data are present in Supplemental Figure S6b&c. Here again, we should note that the effect of synthetic CoA is quantitatively similar to the effect of yeast CoA and Imiquimod (Supplemental Figure S6a).
      • Activation of the TLR4 reporter cell line is available in Supplemental Figure S4c.
      • Regarding the title of the manuscript, we acknowledge that we struggled a bit with how to frame our findings. Importantly, our findings support a model where (i) CoA provision enhances the IL-4 response not via metabolic changes but rather by acting as a mild pro-inflammatory stimulus, and (ii) MyD88 signaling augments the IL-4 response. We should also note that our findings simply show that CoA does not exclusively enhance the IL-4 response via MyD88 signaling, and there may be other redundant pathways (similarly to the MyD88 agonist imiquimod but unlike the MyD88 agonists Pam3-CSK4 and low concentrations of LPS). We are open to working with journal editors to strike the right balance of scientific accuracy and representation of the work when deciding on a final title.
      • The results from the tumor model in Fig. 5 are presented to show a stronger tumor-promoting effect of m(IL-4) stimulated with Pam3. However, the variability of the data is high and 2 out of 6 mice in the +Pam3 group appear to actually have a lower tumor weight than the control mice. Therefore, these data are quite superficial and preliminary, and would benefit from a replicate experiment. Furthermore, for the evaluation of CoA as a biologically relevant DAMP, it would be important to know whether CoA-treated m(IL-4) show the same tumor-promoting effect in vivo as Pam3.

      We thank the reviewer for their comment, and agree that our in vivo work is indeed preliminary. Our goal with this report was to focus on the initial discovery of this molecular pathway and its first, broad characterization using a range of techniques (e.g. in vivo outcomes, ATAC-Seq, etc.), many of which can spur more detailed follow-up studies for future papers. As detailed in the manuscript discussion (Lines 415-419), future work beyond our initial discovery is warranted to thoroughly explore the physiological outcomes of CoA as a metabolic DAMP in relevant model systems such as acute liver injury. As an initial proof-of-concept to show that MyD88 signaling can enhance alternative activation, however, we believe our two discrete experiments (sterile inflammation and tumor formation) are sufficient to indicate the phenotype is likely relevant in animal models. In vivo syngeneic tumor models display natural variability in tumor size due to differences in implantation efficiency, host immune responses, and tumor-intrinsic growth kinetics. Nonetheless, our statistical analysis demonstrates with high confidence that the observed differences are reproducible and not attributable to random variation.

      Minor comments:

        • Fig. 1b: were the gates for CD206/CD301 set based on isotype control stainings?

      We thank the reviewer for pointing out this oversight in our methods. The gates were indeed set on isotype control stains, and this is now mentioned in Lines 519-521 of the revised manuscript.

      The formatting is not cohesive: m(IL-4) vs. M(IL-4)

      Again, this is an embarrassing oversight on our part and we have done our very best to copy edit the piece and remove any inconsistencies and errors.

      Methods: primer sequences are not shown. They should be provided.

      We thank the reviewer for pointing this out, and now include all primer sequences used in Supplemental Table 1 of the revised manuscript.

      Description of flowcytometry (L/D staining after surface? No washing steps after addition of L/D staining)

      We thank the reviewer for pointing out another oversight in our methods, and have provided a more detailed description of the flow cytometric analysis in Lines 509-521 of the revised manuscript.

      Statistics: the methods section states that variability is indicated by SD, but the Figure legends always mention SEM. Please correct.

      We are grateful for the reviewer’s helpful attention to detail, and have corrected the methods to line up with the figure legends.

      A multitude of typos and editorial inconsistencies (e.g. spelling of m(IL-4), punctuation and capitalization) should be corrected/streamlined.

      We are grateful for the reviewer’s helpful attention to detail, and have done our best to copy edit the manuscript prior to resubmission.

      Reviewer #3 (Significance (Required)):

      strengths: I like that the authors follow up their previous work on Etomoxir and CoA, now finding again an unexpected twist in how the effect on m(IL-4) is brought about. This makes the story more complicated, but is important to get to a more precise and realistic understanding of metabolic and transcriptomic regulation and how they are interconnected (or not). In addition, the use of a relatively broad set of methods including ATACseq and mass spectrometry is a strength.

      weakness: the use of the not very pure yeast derived CoA prep, which is controlled for induction of proinflammatory cytokines by one experiment with synth. CoA. This validation needs to be expanded (see comments above) to substantiate the main message of the manuscript.

      The scope of the manuscript is quite focussed on the mechanism of CoA enhanced m(IL-4). The finding that CoA appears not to act by changing intracellular macrophage metabolism but instead, after its release, by activating TLR4 widens the scope and suggests a new function for CoA as DAMP. This aspect would need to be further substantiated to be convincing.

      Audience: scientists working at the intersection between metabolism and innate immunity will be interested in the results.

      We thank the reviewer for their kind comments regarding the precision, credibility, and breadth of our manuscript. We hope they find our revised manuscript an improvement over our previous submission regarding both the new experiments and modified text. The comments have undoubtedly improved our manuscript and we are grateful to the reviewer for the considerable effort they put into reading our submission.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript on enhancement of mIL-4 polarization by exogenous CoA, the authors follow up on their previous studies that had shown a correlation between Etomoxir-driven block in mIL-4 and a reduction of intracellular CoA levels. The results obtained (lack of enhancement of IL-4-induced changes in oxidative phosphorylation and glycolysis; lack of impact of pharmacological decrease/increase of intracellular CoA levels) led them to discard their initial hypothesis. Instead, the presence of a proinflammatory gene signature in macrophages treated with IL-4+CoA triggered experiments testing the involvement of TLR-Myd88 signaling and the identification of CoA as a weak agonist for TLR4 (which is consistent with a preprint manuscript posted in 2022 by others and showing induction of proinflammatory gene expression in a TLR2/4-dependent manner).

      Significance:

      Overall, these results are novel and interesting, although the use of yeast-derived CoA preparations raises a question about the contribution of contaminants that is only partially controlled by data obtained with a synthetic CoA. Regarding a biological role for CoA in macrophage biology in vivo, the authors propose that CoA may act as a DAMP upon release from dying/dead cells and thereby modify transcriptional polarization of m(IL-4). I have several comments related to specific experimental conditions and interpretation that should be addressed. Most importantly, the key findings of the manuscript should be demonstrated using synthetic CoA as described in comment #5.

      Major comments:

      1. Increasing/decreasing intracellular CoA levels does not alter IL-4-induced CD206 expression (Fig. 2i/j). However, the impact of CoA addition to mIL-4 is stronger for Ccl8 and Mgl2 mRNA (Fig. 1a) than for the CD206+ cell fraction (Fig. 1d). Therefore, it would be better (higher sensitivity) to include expression of these genes as readout after CPCA/PZ-2891 treatment.
      2. The CoA-induced proinflammatory gene expression in Fig. 3c is relatively weak (e.g. compared to LPS). The authors use CoA throughout the manuscript at a concentration of 1 mM, and we do not know how much of it is required to cause an effect at all. Therefore, dose-response curves for the stimulation of macrophages with titrated amounts of CoA should be provided. In addition,
      3. Related question: we are informed that the concentration of CoA in the mitochondrial matrix is 5mM, whereas cytosol contains 100µM. For CoA to act as DAMP, I would like to know the concentration of it in supernatants of cell cultures (live vs. dying/dead cells) and from tissues.
      4. It is very good that the authors validate the findings obtained using the yeast-derived CoA with the synthetic molecule. It is very conceivable that the 15% contaminating substances in the yeast CoA could be causing the observed changes in m(IL-4). The fact that synthetic CoA has higher activity in proinflammatory gene expression by BMM (Suppl. Fig. S3) is reassuring, however, it raises the question why this is the case. One possibility is that the concentrations of the different CoA preps cannot directly be compared. Therefore, dose response curves should also be provided for synthetic CoA.
      5. The difference in activity of yeast and synth. CoA could also be caused by the additional biologically active molecules in the yeast CoA. Therefore, it is important to show that the key findings in the paper (enhancement of m(IL-4) associated gene expression and CD206+ upregulation in vitro and in vivo) are also induced by synth. CoA. This is even more important in the context of the Myd88-independence of CD206+ upregulation in BMM treated with CoA (Suppl. Fig. S4). The experiment should be repeated with synth. CoA. If the enhancement of CD206+ cells induced by CoA is indeed unchanged in Myd88 KO BMM, then the title of the manuscript "CoA enhances alternative macrophage activation via Myd88" would not be supported by the data and would need to be changed. Activation of the TLR4 reporter cell line should also be tested using the synth. CoA molecule.
      6. The results from the tumor model in Fig. 5 are presented to show a stronger tumor-promoting effect of m(IL-4) stimulated with Pam3. However, the variability of the data is high and 2 out of 6 mice in the +Pam3 group appear to actually have a lower tumor weight than the control mice. Therefore, these data are quite superficial and preliminary, and would benefit from a replicate experiment. Furthermore, for the evaluation of CoA as a biologically relevant DAMP, it would be important to know whether CoA-treated m(IL-4) show the same tumor-promoting effect in vivo as Pam3.

      Minor comments:

      1. Fig. 1b: were the gates for CD206/CD301 set based on isotype control stainings?
      2. The formatting is not cohesive: m(IL-4) vs. M(IL-4)
      3. Methods: primer sequences are not shown. They should be provided.
      4. Description of flowcytometry (L/D staining after surface? No washing steps after addition of L/D staining)
      5. Statistics: the methods section states that variability is indicated by SD, but the Figure legends always mention SEM. Please correct.
      6. A multitude of typos and editorial inconsistencies (e.g. spelling of m(IL-4), punctuation and capitalization) should be corrected/streamlined.

      Significance

      Strengths: I like that the authors follow up their previous work on Etomoxir and CoA, now finding again an unexpected twist in how the effect on m(IL-4) is brought about. This makes the story more complicated, but is important to get to a more precise and realistic understanding of metabolic and transcriptomic regulation and how they are interconnected (or not). In addition, the use of a relatively broad set of methods including ATACseq and mass spectrometry is a strength.

      Weakness: the use of the not very pure yeast derived CoA prep, which is controlled for induction of proinflammatory cytokines by one experiment with synth. CoA. This validation needs to be expanded (see comments above) to substantiate the main message of the manuscript.

      The scope of the manuscript is quite focussed on the mechanism of CoA-enhanced m(IL-4). The finding that CoA appears not to act by changing intracellular macrophage metabolism but instead, after its release, by activating TLR4 widens the scope and suggests a new function for CoA as a DAMP. This aspect would need to be further substantiated to be convincing.

      Audience: scientists working at the intersection between metabolism and innate immunity will be interested in the results.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript "Realistic coupling enables flexible macroscopic traveling waves in the mouse cortex" by Sun, Forger, and colleagues presents a novel computational framework for studying macroscopic traveling waves in the mouse cortex by integrating realistic brain connectivity data with large-scale neural simulations.

      The key contributions include:
      (1) developing an algorithm that combines spatial transcriptomic data (providing detailed neuron positions and molecular properties) with voxelized connectivity data from the Allen Brain Atlas to construct neuron-to-neuron connections across ~300,000 cortical neurons;
      (2) building a GPU-accelerated simulation platform capable of modeling this large-scale network with both excitatory and inhibitory Hodgkin-Huxley neurons;
      (3) extending phase-based analysis methods from 2D to 3D to quantify traveling wave activity in the realistic brain geometry; and
      (4) demonstrating that realistic Allen connectivity generates significantly higher levels of macroscopic traveling waves compared to simplified local or uniform connectivity patterns.

      The study reveals that wave activity depends non-monotonically on coupling strength and that slow oscillations (0.5-4 Hz) are particularly conducive to large-scale wave propagation, providing new insights into how anatomical connectivity enables flexible spatiotemporal dynamics across the cortex.

      Strengths:

      The authors leverage two existing dense datasets of spatial transcriptomic data and connection strength between pairwise voxels in the mouse cortex in a novel way, allowing for the computational model to capture molecular and functional properties of neurons as determined by their neurotransmitter profiles, rather than making arbitrary assignments of excitatory/inhibitory roles. Additionally, the authors' expansion of 2D phase dynamics to 3D phase gradient analysis methods is important and can be widely applied to calcium imaging, LFP recordings, and likely other electrophysiological recordings.

      Weaknesses:

      Despite these important computational advancements, a few aspects of this model, particularly the inability to validate the model with experimental neural data, diminish my enthusiasm for this paper:

      (1) The model's Allen connectivity approach overlooks critical aspects of real cortical dynamics. Most importantly, it excludes subcortical structures, especially the thalamus, which drives cortical traveling waves through thalamocortical interactions. The authors' method of electrically stimulating all layer 4 neurons simultaneously to initiate waves is artificially crude and bears little resemblance to natural wave generation mechanisms.

      (2) The model handles voxel-to-voxel connections crudely when neurons have mixed excitatory/inhibitory properties and varying synaptic strengths. Real connectivity differs dramatically between neuron types (pyramidal cells vs. interneurons, across cortical layers), but the model only distinguishes excitatory and inhibitory neurons. Additionally, uniform synaptic weights ignore natural variations in connection strength based on neuron type, distance, and functional role. Integrating the updated thalamocortical dataset mentioned by the authors, even at regional resolution, would substantially improve the model.

      (3) While the authors bridge microscopic (single neuron) and mesoscopic (regional connectivity) data to study macroscopic (whole-cortex) waves, they don't integrate the distinct mechanisms operating at each scale. The framework demonstrates that realistic connectivity enables macroscopic waves but fails to connect how wave dynamics emerge and interact across spatial scales systematically.

      (4) Claims that Allen connectivity produces higher phase gradient directionality (PGD) than local connectivity appear limited to delta oscillations at very specific coupling strengths and applied currents. Few parameter combinations show significantly higher PGD for Allen connectivity, and these are generally low PGD values overall.

      (5) Broadly, it's unclear how this computational framework can study memory, learning, sleep, sensory processing, or disease states, given the disconnect between simulated intracellular voltages and the local field potentials or other electrophysiological measurements typically used to study cortical traveling waves. While computationally impressive, the practical research applications remain vague.

      (6) The paper needs a clearer explanation for why medium coupling (100%) eliminates waves in Allen connectivity (Figure 6) while stronger coupling (150%) restores them.

      (7) Does using a single connectivity parameter (ρ = 300) across all regions miss important regional differences in cortical connectivity density?
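Point (4) above turns on phase gradient directionality (PGD). As a reminder of what that number measures, here is a minimal sketch using the common definition of PGD as the norm of the mean phase-gradient vector divided by the mean of the gradient norms. The function name and the list-of-tuples input are assumptions made for illustration, and how the manuscript estimates 3D gradients is not specified here.

```python
import math

def phase_gradient_directionality(gradients):
    """PGD = |mean gradient vector| / mean |gradient vector|.

    `gradients` is a list of (gx, gy, gz) phase-gradient vectors, one per
    site (hypothetical input; gradient estimation is outside this sketch).
    """
    n = len(gradients)
    if n == 0:
        raise ValueError("need at least one gradient vector")
    # Component-wise mean of the gradient vectors.
    mean_vec = [sum(g[k] for g in gradients) / n for k in range(3)]
    # Mean of the individual vector lengths.
    mean_norm = sum(math.sqrt(sum(c * c for c in g)) for g in gradients) / n
    if mean_norm == 0.0:
        return 0.0  # no gradient anywhere: treat as no directionality
    return math.sqrt(sum(c * c for c in mean_vec)) / mean_norm

# Illustrative inputs only: fully aligned vs. exactly opposed gradients.
aligned = [(1.0, 0.0, 0.0)] * 4
opposed = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
pgd_aligned = phase_gradient_directionality(aligned)
pgd_opposed = phase_gradient_directionality(opposed)
```

Under this definition PGD is bounded between 0 and 1, which is why uniformly low values across conditions are hard to interpret as strong wave activity.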

    2. Reviewer #2 (Public review):

      Summary:

      This work presents a spiking network model of traveling waves at the whole-brain scale in the mouse neocortex. The authors use data from the Allen Institute to reconstruct connectivity between different neocortical sites. They then quantify macroscopic traveling waves following stimulation of all layer 4 neurons in the neocortex.

      Strengths:

      Overall, the results are interesting and shed new light on the dynamic organization of activity across the neocortex of the mouse. The paper uses realistic neuron models specifically fit to intracellular recordings, demonstrating that traveling waves occur in the mouse neocortex with both realistic connectivity and realistic single-neuron dynamics. The paper is also well-written in general. For these reasons, the authors have generally achieved their aims in this work.

      Weaknesses:

      (1) Description of Algorithm 1:
      While the Methods section clearly explains the density parameter \rho, the statement on line 358 concerning the "ideal" average number of connections is a little unclear. The authors should explicitly clarify that \rho is a free parameter that can be adjusted to balance computational feasibility (for a given set of computational resources) and biological fidelity.

      (2) Lines 102-103:
      The \rho parameter used here results in approximately 300 connections per neuron on average. The authors should state clearly that the number of connections per cell is the key determinant of computational feasibility (cf. Morrison et al., Neural Computation, 2005). The authors should also review neuronal density and synaptic connectivity in the mouse neocortex and clearly reference density and connectivity in their model to the biological scales found in the mouse.

      (3) Line 131:
      From the plots in Figure 2, it is not clear that the stimulus response is necessarily a rhythmic oscillation, in the sense of a single narrowband frequency.

      (4) Line 217:
      The authors should clarify how these findings relate to the results from Mohajerani et al. (Nature Neuroscience, 2013) or differ from them.

      (5) Line 230:
      Because higher temporal frequency activity also tends to be more spatially localized, a correlation between PGD and temporal frequency could be an inherent consequence of this relationship, rather than a meaningful result.

      (6) Lines 247-248:
      It is not clear that the algorithm for generating connections between neurons presented here really relates to those for community detection. For example, in the case of the Allen Institute data, the communities are essentially in the data already.

      (7) Lines 284-285:
      The relationship between conduction delay and anatomical distance is more direct than this sentence suggests. Conduction delay is fundamentally determined by the time required for action potentials to propagate along axons, making it intrinsically linked to anatomical distance.

      (8) Lines 287-288:
      The authors suggest at this point that they do not have enough information to estimate time delays due to axonal conduction along white matter fibers. However, experimental data from white matter connections typically includes information about fiber length, which does enable estimating conduction delays. These estimations have been previously implemented for Allen Institute connectome data in the mouse (Choi and Mihalas, PLoS Comput Biology, 2019) and human connectome data (Budzinski et al., Physical Review Research, 2023).

      (9) Lines 294-295:
      Several methods do exist for detecting and characterizing wave dynamics in three-dimensional data (Budzinski et al., Physical Review Research, 2023).
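Points (7) and (8) rest on a simple relation: conduction delay is fiber length divided by conduction velocity. The sketch below illustrates only that arithmetic. The function name and the example length and velocity are hypothetical, and it does not reproduce the procedure used in the cited studies.

```python
def conduction_delay_ms(fiber_length_mm, velocity_m_per_s):
    """Axonal conduction delay in milliseconds.

    Since 1 m/s equals 1 mm/ms, dividing a length in mm by a velocity
    in m/s gives the delay directly in ms.
    """
    if velocity_m_per_s <= 0:
        raise ValueError("velocity must be positive")
    return fiber_length_mm / velocity_m_per_s

# Hypothetical values for illustration only: a 5 mm fiber at 2.5 m/s.
example_delay = conduction_delay_ms(5.0, 2.5)
```

Given a matrix of fiber lengths between regions, applying this per connection yields a delay matrix that a network simulation could use in place of a single uniform delay.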

    1. Reviewer #1 (Public review):

      Summary: This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler task mental representation.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Weaknesses:

      (1) Clarity of the VGC model and behavioral task: The exposition of the VGC model lacks sufficient detail for non-expert readers. It is not clear how this model infers which maze obstacles are relevant or irrelevant for planning, nor how the maze tasks specifically operationalize "planning" versus other cognitive processes.

      The method for classifying obstacles as relevant or irrelevant to the task and connecting metacognitive awareness (i.e., participants' reports of noticing obstacles) to attentional capture is not well justified. The rationale for why awareness serves as a valid attention proxy, as opposed to behavioral or neurophysiological markers, should be clearer.

      (2) Attention framework: The account of attention is largely limited to the "spotlight" model. When solving a maze, participants trace the correct trail, following it mentally with their overt or covert attention. In this perspective, relevant concepts are also rooted in attention literature pertaining to object-based attention using tasks like curve tracing (e.g., Pooresmaeili & Roelfsema, 2014) and to mental maze solving (e.g., Wong & Scholl, 2024), which may be highly relevant and add nuance to the current work. This view of attention may be more pertinent to the task than models of simultaneously tracking multiple objects cited here. Prior work (notably from the Roelfsema group) indicates that attentional engagement in curve-tracing tasks may be a continuous, bottom-up process that progressively spreads along a trajectory, in time and space, rather than a "spotlight" that simply travels along the path. The spread of attention depends on the spatial proximity to distractors - a point that could also be pertinent to the findings here.

      Moreover, the tracing of a "solution" trail in a maze may be spontaneous and not only a top-down voluntary operation (Wong & Scholl, 2024), a finding that requires a more careful framing of the link to conscious perception discussed in the manuscript.

      Conceptualizing attention as a spatial spotlight may therefore oversimplify its role in navigation and planning. Perhaps the observed attentional modulation reflects a perceptual stage of building the trail in the maze rather than a filter for a later representation for more efficient decision making and planning. A fuller discussion of whether the current model and data can distinguish between these frameworks would benefit readers.

      (3) Lateralization of attention: The analysis considers whether relevant information is distributed bilaterally or unilaterally across the visual display, but does not sufficiently address evidence for attentional asymmetries across the left and right visual fields due to hemispheric specialization (e.g., Bartolomeo & Seidel Malkinson, 2019). Whether effects differ for left versus right hemifield arrangements is not made explicit in the presented findings.

      (4) Individual differences: Individual differences in attentional modulation are a strength of the work, but similar analyses exploring individual variation in lateralization effects could provide further insight, and the lack of such analyses may mask important effects.

      (5) Distinction between overt and covert attention: The current report at times equates eye movement patterns with the locus of attention. However, attention can be covertly shifted without corresponding gaze changes (see, for example, Pooresmaeili & Roelfsema, 2014).

      The implications for interpreting the relationship between eye movement, memory, and attention in this setting are not fully addressed. The potential dynamics of attention along a maze trajectory and their impact on lateralization analysis would benefit from further clarification.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      However, some conceptual and methodological aspects would be clearer with greater engagement with the broader literature on attention dynamics, a more explicit justification of operational choices, and more targeted lateralization analyses.

    2. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Weaknesses:

      (1) The spotlight-VGC model has a free parameter - the "width" of the attentional spotlight. This seems to have been fixed to be 3 squares. It would be good if the authors could describe a more principled procedure for selecting the width so that others can use the model in other contexts.

      (2) Have the authors considered other ways in which factors such as attentional spillover and lateralization could be incorporated into the model? The spotlight-VGC model, as presented, involves first computing VGC predictions and only afterwards computing spillover. This seems psychologically implausible, since it supposes that the "optimal" representation is first formed and then it gets corrupted. Is there a way to integrate these biases directly into the VGC framework, perhaps as a prior on construals? The authors gesture towards this when they talk about "inductive biases", but this is not formalized.

      (3) Can the authors rule out that the lateralization effects are the result of memory biases since the main measure used is a self-report of attention?

  2. teacher.imperial-english.com teacher.imperial-english.com
    1. Look at the answers to some questions about people’s daily lifestyles. Write the questions for each answer.

      Question 1) (How / many) How many siblings does she have?

      Question 2) (Where) Where do your parents live?

      Question 3) (Who) Who was your role model growing up?

      Question 4) (How / much) How much does a black coffee without milk cost?

      Question 5) (Why) Why did you choose Loughborough University?

    1. ________________________________________________________
      1. I anticipate 4 years. 2. I'm not really sure how many I will need. 3. Turning in all of my assignments. 4. Yes, I am confident.
    2. ________________________________________________________
      1. I anticipate being in college for 2-4 years to become a labor and delivery nurse.
      2. I would probably have to take about 3-5 per term to finish in my time period. 3.I think the most difficult part will be making time for everything
      3. I do believe I can overcome any difficulties that will come my way.
    3. ________________________________________________________

      1) I plan on graduating in 2 to 3 years and going to culinary school 2) 5 or more 3) Math and the mental struggle 4) Yes, and I know I can

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2025-02922

      Corresponding author(s): Christian Specht


      1. General Statements [optional]


      We thank the reviewers for their thorough and constructive evaluation of our work. We have revised the manuscript carefully and addressed all the criticisms raised, in particular the issues mentioned by several of the reviewers (see point-by-point response below). We have also added a number of explanations in the text for the sake of clarity, while trying to keep the manuscript as concise as possible.


      In our view, the novelty of our research is two-fold. From a neurobiological point of view, we provide conclusive evidence for the existence of glycine receptors (GlyRs) at inhibitory synapses in various brain regions including the hippocampus, dentate gyrus and sub-regions of the striatum. This solves several open questions and has fundamental implications for our understanding of the organisation and function of inhibitory synapses in the telencephalon. Secondly, our study makes use of the unique sensitivity of single molecule localisation microscopy (SMLM) to identify low protein copy numbers. This is a new way to think about SMLM as it goes beyond a mere structural characterisation and towards a quantitative assessment of synaptic protein assemblies.

      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors investigate the nanoscopic distribution of glycine receptor subunits in the hippocampus, dorsal striatum, and ventral striatum of the mouse brain using single-molecule localization microscopy (SMLM). They demonstrate that only a small number of glycine receptors are localized at hippocampal inhibitory synapses. Using dual-color SMLM, they further show that clusters of glycine receptors are predominantly localized within gephyrin-positive synapses. A comparison between the dorsal and ventral striatum reveals that the ventral striatum contains approximately eight times more glycine receptors and this finding is consistent with electrophysiological data on postsynaptic inhibitory currents. Finally, using cultured hippocampal neurons, they examine the differential synaptic localization of glycine receptor subunits (α1, α2, and β). This study is significant as it provides insights into the nanoscopic localization patterns of glycine receptors in brain regions where this protein is expressed at low levels. Additionally, the study demonstrates the different localization patterns of GlyR in distinct striatal regions and its physiological relevance using SMLM and electrophysiological experiments. However, several concerns should be addressed.

      The following are specific comments:

      1. Colocalization analysis in Figure 1A. The colocalization between Sylite and mEos-GlyRβ appears to be quite low. It is essential to assess whether the observed colocalization is due to random overlap. The authors should consider quantifying colocalization using statistical methods, such as a pixel shift analysis, to determine whether colocalization frequencies remain similar after artificially displacing one of the channels.

      Following the suggestion of reviewer 1, we re-analysed CA3 images of Glrbeos/eos hippocampal slices by applying a pixel-shift type of control, in which the Sylite channel (in far red) was horizontally flipped relative to the mEos4b-GlyRβ channel (in green, see Methods). As expected, the number of mEos4b-GlyRβ detections per gephyrin cluster was markedly reduced compared to the original analysis (revised Fig. 1B), confirming that the synaptic mEos4b detections exceed chance levels (see page 5).
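The logic of the flip control described in this reply can be sketched as follows: count detections that fall inside the gephyrin cluster mask, then repeat the count after mirroring the mask horizontally. This is a minimal sketch under stated assumptions, not the authors' analysis pipeline. The pixel-set representation, function names, image width and coordinates are all hypothetical.

```python
def flip_horizontal(mask_pixels, width):
    """Mirror a set of (x, y) mask pixels about the vertical image axis."""
    return {(width - 1 - x, y) for (x, y) in mask_pixels}

def detections_in_mask(detections, mask_pixels):
    """Count detections whose pixel coordinate lies inside the mask."""
    return sum(1 for d in detections if d in mask_pixels)

# Hypothetical toy data: a 10-pixel-wide image with one small cluster.
width = 10
gephyrin_mask = {(1, 1), (2, 1), (1, 2), (2, 2)}
detections = [(1, 1), (2, 2), (2, 1), (7, 5)]

# Detections overlapping the cluster in the original orientation.
observed = detections_in_mask(detections, gephyrin_mask)
# Same count after flipping the mask: an estimate of chance overlap.
control = detections_in_mask(detections, flip_horizontal(gephyrin_mask, width))
```

If the observed count clearly exceeds the flipped-mask count across many images, the overlap is unlikely to be explained by random coincidence alone.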

      Inconsistency between Figure 3A and 3B. While Figure 3B indicates an ~8-fold difference in the number of mEos4b-GlyRβ detections per synapse between the dorsal and ventral striatum, Figure 3A does not appear to show a pronounced difference in the localization of mEos4b-GlyRβ on Sylite puncta between these two regions. If the images presented in Figure 3A are not representative, the authors should consider replacing them with more representative examples or providing expanded images with multiple representative examples. Alternatively, if this inconsistency can be explained by differences in spot density within clusters, the authors should explain that.

      The pointillist images in Fig. 3A are essentially binary (red-black). Therefore, the density of detections at synapses cannot be easily judged by eye. For clarity, the original images in Fig. 3A have been replaced with two other examples that better reflect the different detection numbers in the dorsal and ventral striatum.


      Quantification in Figure 5. It is recommended that the authors provide quantitative data on cluster formation and colocalization with Sylite puncta in Figure 5 to support their qualitative observations.

      This is an important point that was also raised by the other reviewers. We have performed additional experiments to increase the data volume for analysis. For quantification, we used two approaches. First, we counted the percentage of infected cells in which synaptic localisation of the recombinant receptor subunit was observed (Fig. 5C). We found that mEos4b-GlyRa1 consistently localises at synapses, indicating that all cells express endogenous GlyRb. When neurons were infected with mEos4b-GlyRb, fewer cells had synaptic clusters, meaning that indeed, GlyR alpha subunits are the limiting factor for synaptic targeting. In cultures infected with mEos4b-GlyRa2, only very few neurons displayed synaptic localisation (as judged by epifluorescence imaging). We think this shows that GlyRa2 is less capable of forming heteromeric complexes than GlyRa1, in line with our previous interpretation (see pp. 9-10, 13).


      Secondly, we quantified the total intensity of each subunit at gephyrin-positive domains, both in infected neurons and in non-infected control cultures (Fig. 5D). We observed that mEos4b-GlyRa1 intensity at gephyrin puncta was higher than that of the other subunits, again pointing to efficient synaptic targeting of GlyRa1. Gephyrin cluster intensities (Sylite labelling) were not significantly different in GlyRb and GlyRa2 expressing neurons compared to the uninfected control, indicating that the lentiviral expression of recombinant subunits does not fundamentally alter the size of mixed inhibitory synapses in hippocampal neurons. Interestingly, gephyrin levels were slightly higher in hippocampal neurons expressing mEos4b-GlyRa1. In our view, this results from an enhanced expression and synaptic targeting of mEos4b-GlyRa1 heteromers with endogenous GlyRb, pointing to a structural role of GlyRa1/b in hippocampal synapses (pp. 10, 13).


      The new data and analyses have been described and illustrated in the relevant sections of the manuscript.

      Potential for pseudoreplication. It's not clear whether they're performing statistical tests across biological replicates, images, or even synapses. They often quote mean +/- SEM with n in the 1000s, so does that mean they're doing tests on those 1000s? This needs to be clarified.

      All experiments were repeated at least twice to ensure reproducibility (N independent experiments). Statistical tests were performed on pooled data across the biological replicates; n denotes the number of data points used for testing (e.g., number of synaptic clusters, detections, cells, as specified in each case). We have systematically given these numbers in the revised manuscript (n, N, and other experimental parameters such as the number of animals used, coverslips, images or cells). Data are generally given as mean +/- SEM or as mean +/- SD as indicated.
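The distinction between n (pooled data points) and N (independent experiments) matters because the standard error shrinks with the number of values entered into it. The sketch below compares the SEM computed over all pooled measurements with the SEM computed over per-experiment means. The numbers are made up for illustration and are not taken from the manuscript.

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(count)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical per-synapse measurements from N = 3 independent experiments.
experiments = [
    [10.0, 12.0, 11.0, 13.0],
    [20.0, 22.0, 21.0, 23.0],
    [15.0, 17.0, 16.0, 18.0],
]

# Pooling treats every synapse as an independent observation (n = 12).
pooled = [v for exp in experiments for v in exp]
sem_pooled = sem(pooled)

# Averaging within each experiment first treats N = 3 as the sample size.
replicate_means = [statistics.mean(exp) for exp in experiments]
sem_replicates = sem(replicate_means)
```

In this toy case the pooled SEM is less than half the replicate-level SEM, because between-experiment variability is spread over many more nominal observations. Reporting both n and N lets readers judge which level the error bars describe.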


      Does mEos affect expression levels or function of the protein? I can't see any experiments done to confirm this. I would suggest a WB on homogenate, or mass spec.

      The Glrbeos/eos knock-in mouse line has been characterised previously and does not display any ultrastructural or functional deficits at inhibitory synapses (Maynard et al. 2021 eLife). GlyRβ expression and glycine-evoked responses were not significantly different to those of the wild-type. The synaptic localisation of mEos4b-GlyRb in KI animals demonstrates correct assembly of heteromeric GlyRs and synaptic targeting. Accordingly, the animals do not display any obvious phenotype. We have clarified this in the manuscript (p. 4). In the case of cultured neurons, long-term expression of fluorescent receptor subunits with lentivirus has proven ideal to achieve efficient synaptic targeting. The low and continuous supply of recombinant receptors ensures assembly with endogenous subunits to form heteropentameric receptor complexes (e.g. [Patrizio et al. 2017 Sci Rep]). In the present study, lentivirus infection did not induce any obvious differences in the number or size of inhibitory synapses compared to control neurons, as judged by Sylite labelling of synaptic gephyrin puncta (new Fig. 5D).

      Quantification of protein numbers is challenging with SMLM. Issues include i) some of FP not correctly folded/mature, and ii) dependence of localisation rate on instrument, excitation/illumination intensities, and also the thresholds used in analysis. Can the authors compare with another protein that has known expression levels- e.g. PSD95? This is quite an ask, but if they could show copy number of something known to compare with, it would be useful.

      We agree that absolute quantification with SMLM is challenging, since the number of detections depends on fluorophore maturation, photophysics, imaging conditions, and analysis thresholds (discussed in Patrizio & Specht 2016, Neurophotonics). For this reason, only very few datasets provide reliable copy numbers, even for well-studied proteins such as PSD-95. One notable exception is the study by Maynard et al. (eLife 2021) that quantified endogenous GlyRb-containing receptors in spinal cord synapses using SMLM combined with correlative electron microscopy. The strength of this work was the use of a KI mouse strain, which ensures that mEos4b-GlyRb expression follows intrinsic regional and temporal profiles. The authors reported a stereotypic density of ~2,000 GlyRs/µm² at synapses, corresponding to ~120 receptors per synapse in the dorsal horn and ~240 in the ventral horn, taking into account various parameters including receptor stoichiometry and the functionality of the fluorophore. These values are very close to our own calculations of GlyR numbers at spinal cord synapses that were obtained slightly differently in terms of sample preparation, microscope setup, imaging conditions, and data analysis, lending support to our experimental approach. Nevertheless, the obtained GlyR copy numbers at hippocampal synapses clearly have to be taken as estimates rather than precise figures, because the number of detections from a single mEos4b fluorophore can vary substantially, meaning that the fluorophores are not represented equally in pointillist images. This can affect the copy number calculation for a specific synapse, in particular when the numbers are low (e.g. in hippocampus). However, it should not alter the average number of detections (Fig. 1B) or the (median) molecule numbers of the entire population of synapses (Fig. 1C). We have discussed the limitations of our approach (p. 11).

      Rationale for doing nanobody dSTORM not clear at all. They don't explain the reason for doing the dSTORM experiments. Why not just rely on PALM for coincidence measurements, rather than tagging mEoS with a nanobody, and then doing dSTORM with that? Can they explain? Is it to get extra localisations- i.e. multiple per nanobody? If so, localising same FP multiple times wouldn't improve resolution. Also, no controls for nanobody dSTORM experiments- what about non-spec nb, or use on WT sections?

      *As discussed above (point 6), the detection of fluorophores with SMLM is influenced by many parameters, not least the noise produced by emitting molecules other than the fluorophore used for labelling. Our study is exceptional in that it attempts to identify extremely low molecule numbers (down to 1). To verify that the detections obtained with PALM correspond to mEos4b, we conducted robust control experiments (including pixel-shift as suggested by the reviewer, see point 1, revised Fig. 1B). The rationale for the nanobody-based dSTORM experiments was twofold: (1) to have an independent readout of the presence of low-copy GlyRs at inhibitory synapses and (2) to analyse the nanoscale organisation of GlyRs relative to the synaptic gephyrin scaffold using dual-colour dSTORM with spectral demixing (see p. 6). The organic fluorophores used in dSTORM (AF647, CF680) ensure high photon counts, essential for reliable co-localisation and distance analysis. PALM and dSTORM cannot be combined in dual-colour mode, as they require different buffers and imaging conditions.*

      The specificity of the anti-Eos nanobody was demonstrated by immunohistochemistry in spinal cord cultures expressing mEos4b-GlyRb and wildtype control tissue (Fig. S3). In response to the reviewer's remarks, we also performed a negative control experiment in Glrbeos/eos slices (dSTORM), in which the nanobody was omitted (new Fig. S4F,G). Under these conditions, spectral demixing produced a single peak corresponding to CF680 (gephyrin) without any AF647 contribution (Fig. S4F). The number of "false" AF647 detections at synapses was significantly lower than in the slices labelled with the nanobody. We conclude that the fluorescence signal observed in our dual-colour dSTORM experiments arises from the specific detection of mEos4b-GlyRb by the nanobody, rather than from background, cross-reactivity or wrong attribution of colour during spectral demixing. We have added these data and explanations in the results (p. 7) and in the figure legend of Fig. S4F,G.

      What resolutions/precisions were obtained in SMLM experiments? Should perform Fourier Ring Correlation (FRC) on SR images to state resolutions obtained (particularly useful for when they're presenting distance histograms, as this will be dependent on resolution). Likewise for precision, what was mean precision? Can they show histograms of localisation precision.

      This is an interesting question in the context of our experiments with low-copy GlyRs, since the spatial resolution of SMLM is limited also by the density of molecules, i.e. the sampling of the structure in question (Nyquist-Shannon criterion). Accordingly, the priority of the PALM experiments was to improve the sensitivity of SMLM for the identification of mEos4b-GlyRb subunits, rather than to maximize the spatial resolution. The mean localisation precision in PALM was 33 +/- 12 nm, as calculated from the fitting parameters of each detection (Zeiss, ZEN software), which ultimately result from their signal-to-noise ratio. This is a relatively low precision for SMLM, which can be explained by the low brightness of mEos4b compared to organic fluorophores together with the elevated fluorescence background in tissue slices.
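To illustrate why sampling density limits resolution at low copy numbers, the following is a minimal sketch of the commonly used two-dimensional Nyquist estimate, in which the attainable resolution is roughly twice the mean spacing between labelled molecules. The function name and the example values are hypothetical and are not taken from the manuscript.

```python
import math


def nyquist_resolution_nm(n_molecules: int, area_um2: float) -> float:
    """Nyquist-limited resolution (nm) for molecules spread over a 2D area.

    Uses the common approximation: resolution = 2 / sqrt(density).
    """
    if n_molecules <= 0 or area_um2 <= 0:
        raise ValueError("molecule count and area must be positive")
    density_per_nm2 = n_molecules / (area_um2 * 1e6)  # 1 µm² = 1e6 nm²
    return 2.0 / math.sqrt(density_per_nm2)


# Hypothetical illustration values (assumptions, not measured data):
# a densely sampled scaffold versus a synapse with only a few receptors.
dense = nyquist_resolution_nm(n_molecules=400, area_um2=0.04)
sparse = nyquist_resolution_nm(n_molecules=4, area_um2=0.04)
print(f"dense: {dense:.0f} nm, sparse: {sparse:.0f} nm")
```

With only a handful of molecules per synapse, the sampling-limited resolution in this sketch is far coarser than the localisation precision, which is consistent with prioritising detection sensitivity over spatial resolution in the PALM experiments.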


      In the case of dSTORM, the aim was to study the relative distribution of GlyRs within the synaptic scaffold, for which a higher localisation precision was required (p. 6). Therefore, detections with a precision ≥ 25 nm were discarded during analysis with NEO software (Abbelight). The retained detections had a mean localisation precision of 12 +/- 5 nm for CF680 (Sylite) and 11 +/- 4 nm for AF647 (nanobody). These values are given in the revised manuscript (pp. 18, 22).
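The filtering step can be sketched as follows. This is only an illustration of the logic (discard detections with a precision of 25 nm or worse, then report mean and standard deviation of the remainder); it is not the NEO software's implementation, and the function names and example precisions are hypothetical.

```python
from statistics import mean, stdev

PRECISION_CUTOFF_NM = 25.0  # detections at or above this value are discarded


def filter_by_precision(precisions_nm, cutoff_nm=PRECISION_CUTOFF_NM):
    """Keep only detections whose localisation precision is better than the cutoff."""
    return [p for p in precisions_nm if p < cutoff_nm]


def summarise(precisions_nm):
    """Return (mean, sample standard deviation) of the precisions in nm."""
    return mean(precisions_nm), stdev(precisions_nm)


# Hypothetical per-detection precisions in nm (assumed values for illustration)
example = [8.0, 12.0, 16.0, 25.0, 31.0, 40.0]
kept = filter_by_precision(example)
avg, sd = summarise(kept)
print(f"{len(kept)} of {len(example)} detections kept: {avg:.0f} +/- {sd:.0f} nm")
```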

      Why were DBSCAN parameters selected? How can they rule out multiple localisations per fluor? If low copy numbers (

      Multiple detections of the same fluorophore are intrinsic to dSTORM imaging and have not been eliminated from the analysis. Small clusters of detections likely represent individual molecules (e.g. single receptors in the extrasynaptic regions, Fig. 2A). DBSCAN is a robust clustering method that is quite insensitive to minor changes in the choice of parameters. For dSTORM of synaptic gephyrin clusters (CF680), a relatively small search radius (80 nm) together with a high minimum number of detections (≥ 50 neighbours) was chosen to reconstruct the postsynaptic domain with high spatial resolution (see point 8). In the case of the GlyR (nanobody-AF647), the clustering was done mostly for practical reasons, as it provided the coordinates of the centre of mass of the detections. The low stringency of this clustering (200 nm radius, ≥ 5 neighbours) effectively filters out single detections that can result from background noise or incorrect demixing. An additional reference explaining the use of DBSCAN including the choice of parameters is given on p. 22 (see also R2 point 4).
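For readers unfamiliar with DBSCAN, the sketch below shows how the low-stringency setting (200 nm radius, at least 5 neighbours) separates a small group of detections from isolated background detections and yields a centre of mass. It is a plain-Python illustration with quadratic run time, not the NEO software's implementation. Two assumptions are made: the neighbour count includes the detection itself, and the coordinates are synthetic values in nm.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if math.hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: cluster index (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be re-labelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbours)
    return labels


def centre_of_mass(points):
    """Mean x and y of a list of (x, y) coordinates."""
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))


# Synthetic detections in nm (assumed values): six detections from one
# fluorophore plus two isolated background detections.
offsets = [(0, 0), (10, 5), (-8, 12), (15, -6), (-12, -9), (4, 18)]
receptor = [(1000.0 + dx, 1000.0 + dy) for dx, dy in offsets]
background = [(3000.0, 500.0), (200.0, 4000.0)]
detections = receptor + background

labels = dbscan(detections, eps=200.0, min_pts=5)
centre = centre_of_mass([p for p, l in zip(detections, labels) if l == 0])
print(labels, centre)
```

The same function called with `eps=80.0, min_pts=50` would correspond to the stricter setting described for the gephyrin channel, where only densely sampled structures form clusters.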

      For microscopy experiment methods, state power densities, not % or "nominal power".

      *Done. We now report the irradiance (laser power density) instead of nominal power (pp. 18, 21). *

      In general, not much data presented. Any SI file with extra images etc.?

      *The original submission included four supplementary figures with additional data and representative images that should have been available to the reviewer (Figs. S1-S4). The SI file has been updated during revision (new Fig. S4E-G). *

      Clarification of the discussion on GlyR expression and synaptic localization: The discussion on GlyR expression, complex formation, and synaptic localization is sometimes unclear, and needs terminological distinctions between "expression level", "complex formation" and "synaptic localization". For example, the authors state:"What then is the reason for the low protein expression of GlyRβ? One possibility is that the assembly of mature heteropentameric GlyR complexes depends critically on the expression of endogenous GlyR α subunits." Does this mean that GlyRβ proteins that fail to form complexes with GlyRα subunits are unstable and subject to rapid degradation? If so, the authors should clarify this point. The statement "This raises the interesting possibility that synaptic GlyRs may depend specifically on the concomitant expression of both α1 and β transcripts." suggests a dependency on α1 and β transcripts. However, is the authors' focus on synaptic localization or overall protein expression levels? If this means synaptic localization, it would be beneficial to state this explicitly to avoid confusion. To improve clarity, the authors should carefully distinguish between these different aspects of GlyR biology throughout the discussion. Additionally, a schematic diagram illustrating these processes would be highly beneficial for readers.

      We thank the reviewer for pointing this out. We are dealing with several processes: protein expression, which determines subunit availability and the assembly of pentameric GlyR complexes, surface expression, membrane diffusion, and the accumulation of GlyRb-containing receptor complexes at inhibitory synapses. We have edited the manuscript, particularly the discussion, and tried to be as clear as possible in our wording.


      We chose not to add a schematic illustration for the time being, because any graphical representation is necessarily a simplification. Instead, we preferred to summarise the main numbers in tabular form (Table 1). We are of course open to any other suggestions.

      Interpretation of GlyR localization in the context of nanodomains. The distribution of GlyR molecules on inhibitory synapses appears to be non-homogeneous, instead forming nanoclusters or nanodomains, similar to many other synaptic proteins. It is important to interpret GlyR localization in the context of nanodomain organization.

      The dSTORM images in Fig. 2 are pointillist representations that show individual detections rather than molecules. Small clusters of detections are likely to originate from a single AF647 fluorophore (in the case of nanobody labelling) and therefore represent single GlyRb subunits. Since GlyR copy numbers are so low at hippocampal synapses (≤ 5), the notion of nanodomain is not directly applicable. Our analysis therefore focused on the integration of GlyRs within the postsynaptic scaffold, rather than attempting to define nanodomain structures (see also response to point 8 of R1). A clarification has been added in the revised manuscript (p. 6).

      Reviewer #1 (Significance (Required)):

      The paper presents biological and technical advances. The biological insights revolve mostly on the documentation of Glycine receptors in particular synapses in forebrain, where they are typically expressed at very low levels. The authors provide compelling data indicating that the expression is of physiological significance. The authors have done a nice job of combining genetically-tagged mice with advanced microscopy methods to tackle the question of distributions of synaptic proteins. Overall these advances are more incremental than groundbreaking.

      We thank the reviewer for acknowledging both the technical and biological advances of our study. While we recognize that our work builds upon established models, we consider that it also addresses important unresolved questions, namely that GlyRs are present and specifically anchored at inhibitory synapses in telencephalic regions, such as the hippocampus and striatum. From a methodological point of view, our study demonstrates that SMLM can be applied not only for structural analysis of highly abundant proteins, but also to reliably detect proteins present at very low copy numbers. This ability to identify and quantify sparse molecule populations adds a new dimension to SMLM applications, which we believe increases the overall impact of our study beyond the field of synaptic neuroscience.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript "Single molecule counting detects low-copy glycine receptors in hippocampal and striatal synapses" Camuso and colleagues apply single molecule localization microscopy (SMLM) methods to visualize low copy numbers of GlyRs at inhibitory synapses in the hippocampal formation and the striatum. SMLM analysis revealed higher copy numbers in striatum compared to hippocampal inhibitory synapses. They further provide evidence that these low copy numbers are tightly linked to post-synaptic scaffolding protein gephyrin at inhibitory synapses. Their approach profits from the high sensitivity and resolution of SMLM and challenges the controversial view on the presence of GlyRs in these formations although there are reports (electrophysiology) on the presence of GlyRs in these particular brain regions. These new datasets in the current manuscript may certainly assist in understanding the complexity of fundamental building blocks of inhibitory synapses.

      However I have some minor points that the authors may address for clarification:

      1) In Figure 1 the authors apply PALM imaging of mEos4b-GlyRß (knockin) and here the corresponding Sylite label seems to be recorded in widefield, it is not clearly stated in the figure legend if it is widefield or super-resolved. In Fig 1 A - is the scale bar 5 µm? Some Sylite spots appear to be sized around 1 µm, especially the brighter spots, but maybe this is due to the lower resolution of widefield imaging? Regarding the statistical comparison: what method was chosen to test for normality distribution, I think this point is missing in the methods section.

      *This is correct; the apparent size of the Sylite spots does not reflect the real size of the synaptic gephyrin domain due to the limited resolution of widefield imaging including the detection of out-of-focus light. We have clarified in the legend of Fig. 1A that Sylite labelling was imaged with conventional epifluorescence microscopy. The scale bar in Fig. 1A corresponds to 5 µm. Since the data were not normally distributed, nonparametric tests (Kruskal-Wallis one-way ANOVA with Dunn’s multiple comparison test or Mann-Whitney U-test for pairwise comparisons) were used (p. 23).*

      Moreover I would appreciate a clarification and/or citation that the knockin model results in no structural and physiological changes at inhibitory synapses, I believe this model has been applied in previous studies and corresponding clarification can be provided.

      The Glrbeos/eos mouse model has been described previously and does not exhibit any structural or physiological phenotypes (Maynard et al. 2021 eLife). The issue was also raised by reviewer R1 (point 5) and has been clarified in the revised manuscript (p. 4).

      2) In the next set of experiments the authors switch to demixing dSTORM experiments - an explanation why this is performed is missing in the text - I guess better resolution to perform more detailed distance measurements? For these experiments: which region of the hippocampus did the authors select, I cannot find this information in legend or main text.

      Yes, the dSTORM experiments enable dual-colour structural analysis at high spatial resolution (see response to R1 point 7). An explanation has been added (p. 6).

      3) Regarding parameters of demixing experiments: the number of frames (10,000) seems quite low and the exposure time higher than expected for Alexa 647. Can the authors explain the reason for choosing these particular parameters (low expression profile of the target - so better separation?, fewer fluorophores on label and shorter collection time?) or is there a reference that can be cited? The laser power is given in the methods in percentage of maximal output power, but for better comparison and reproducibility I recommend to provide the values of a power meter (kW/cm2) as lasers may change their maximum output power during their lifetime.

      Acquisition parameters (laser power, exposure time) for dSTORM were chosen to obtain a good localisation precision (~12 nm; see R1 point 8). The number of frames is adequate to obtain well sampled gephyrin scaffolds in the CF680 channel. In the case of the GlyR (nanobody-AF647), the concept of spatial resolution does not really apply due to the low number of targets (see R1, point 13). Power density (irradiance) values have now been given (pp. 18, 21).

      4) For analysis of subsynaptic distribution: how did the authors decide to choose the parameters in the NEO software for DBSCAN clustering - was a series of parameters tested to find optimal conditions and did the analysis start with an initial test if data is indeed clustered (K-ripley) or is there a reference in literature that can be provided?

      DBSCAN parameters were optimised manually, by testing different values. Identification of dense and well-delimited gephyrin clusters (CF680) was achieved with a small radius and a high number of detections (80 nm, ≥ 50 neighbours), whereas filtering of low-density background in the AF647 channel (GlyRs) required less stringent parameters (200 nm, ≥ 5) due to the low number of target molecules. Similar parameters were used in a previous publication (Khayenko et al. 2022, Angewandte Chemie). The reference has been provided on p. 22 (see also R1 point 9).

      5) A conclusion/discussion of the results presented in Figure 5 is missing in the text/discussion.

      *This part of the manuscript has been completely overhauled. It includes new experimental data, quantification of the data (new Fig.5), as well as the discussion and interpretation of our findings (see also R1, point 3). In agreement with our earlier interpretation, the data confirm that low availability of GlyRa1 subunits limits the expression and synaptic targeting of GlyRa1/b heteropentamers. The observation that GlyRa1 overexpression with lentivirus increases the size of the postsynaptic gephyrin domain further points to a structural role, whereby GlyRs can enhance the stability (and size) of inhibitory synapses in hippocampal neurons, even at low copy numbers (pp. 13-14). *

      6) in line 552 "suspension" is misleading, better use "solution"

      Done.

      Reviewer #2 (Significance (Required)):

      Significance: The manuscript provides new insights into the presence of low-copy GlyRs by visualizing them via SMLM. This is the first report that visualizes GlyR optically in the brain applying the knock-in model of mEos4b-tagged GlyRß and quantifies their copy number comparing distribution and amount of GlyRs from hippocampus and striatum. Imaging data correspond well to electrophysiological measurements in the manuscript.

      Field of expertise: Super-Resolution Imaging and corresponding analysis

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      In this study, Camuso et al., make use of a knock-in mouse model expressing endogenously mEos4b-tagged GlyRβ to detect endogenous glycine receptors using single-molecule localization microscopy. The main conclusion from this study is that in the hippocampus GlyRβ molecules are barely detected, while inhibitory synapses in the ventral striatum seem to express functionally relevant GlyR numbers.

      I have a few points that I hope help to improve the strength of this study.

      • In the hippocampus, this study finds that the numbers of detections are very low. The authors perform adequate controls to indicate that these localizations are above noise level. Nevertheless, it remains questionable that these reflect proper GlyRs. The suggestion that in hippocampal synapses the low numbers of GlyRβ molecules "are important in assembly or maintenance of inhibitory synaptic structures in the brain" is in itself interesting, but is not at all supported. It is also difficult to envision how such low numbers could support the structure of a synapse. A functional experiment showing that knockdown of GlyRs affects inhibitory synapse structure in hippocampal neurons would be a minimal test of this.

      *It is not clear what the reviewer means by “it remains questionable that these reflect proper GlyRs”. The PALM experiments include a series of stringent controls (see R1, point 1) demonstrating the existence of low-copy GlyRs at inhibitory synapses in the hippocampus (Fig. 1) and in the striatum (Fig. 3), and are backed up by dSTORM experiments (Fig. 2). We have no reason to doubt that these receptors are fully functional (as demonstrated for the ventral striatum, Fig. 4). However, due to their low number, a role in inhibitory synaptic transmission is clearly limited, at least in the hippocampus and dorsal striatum.*


      We therefore propose a structural role, where the GlyRs could be required to stabilise the postsynaptic gephyrin domain in hippocampal neurons. This is based on the idea that the GlyR-gephyrin affinity is much higher than that of the GABAAR-gephyrin interaction (reviewed in Kasaragod & Schindelin 2018 Front Mol Neurosci). Accordingly, there is a close relationship between GlyRs and gephyrin numbers, sub-synaptic distribution, and dynamics in spinal cord synapses that are mostly glycinergic (Specht et al. 2013 Neuron; Maynard et al. 2021 eLife; Chapdelaine et al. 2021 Biophys J). It is reasonable to assume that low-copy GlyRs could play a similar structural role at hippocampal synapses. A knockdown experiment targeting these few receptors is technically very challenging and beyond the scope of this study. However, in response to the reviewer's question we have conducted new experiments in cultured hippocampal neurons (new Fig. 5). They demonstrate that overexpression of GlyRa1/b heteropentamers increases the size of the postsynaptic domain in these neurons, supporting our interpretation of a structural role of low-copy GlyRs (p. 14).

      • The endogenous tagging strategy is a very strong aspect of this study and provides confidence in the labeling of GlyRβ molecules. One caveat however, is that this labeling strategy does not discriminate whether GlyRβ molecules are on the cell membrane or in internal compartments. Can the authors provide an estimate of the ratio of surface to internal GlyRβ molecules?

      Gephyrin is known to form a two-dimensional scaffold below the synaptic membrane to which inhibitory GlyRs and GABAARs attach (reviewed in Alvarez 2017 Brain Res). The majority of the synaptic receptors are therefore thought to be located in the synaptic membrane, which is supported by the close relationship between the sub-synaptic distribution of GlyRs and gephyrin in spinal cord neurons (e.g. Maynard et al. 2021 eLife). To demonstrate the surface expression of GlyRs at hippocampal synapses we labelled cultured hippocampal neurons expressing mEos4b-GlyRa1 with anti-Eos nanobody in non-permeabilised neurons (see Figure below for the reviewer only). The close correspondence between the nanobody (AF647) and the mEos4b signal confirms that the majority of the GlyRs are indeed located in the synaptic membrane.


      Figure (for the reviewer only). Left: Lentivirus expression of mEos4b-GlyRa1 in fixed and non-permeabilised hippocampal neurons (mEos4b signal). Right: Surface labelling of the recombinant subunit with anti-Eos nanobody (AF647).

      • 'We also estimated the absolute number of GlyRs per synapse in the hippocampus. The number of mEos4b detections was converted into copy numbers by dividing the detections at synapses by the average number of detections of individual mEos4b-GlyRβ containing receptor complexes'. In essence this is a correct method to estimate copy numbers, and the authors discuss some of the pitfalls associated with this approach (i.e., maturation of fluorophore and detection limit). Nevertheless, the authors did not subtract the number of background localizations determined in the two negative control groups. This is critical, particularly at these low-number estimations.

      We fully agree that background subtraction can be useful with low detection numbers. In the revised manuscript, copy numbers are now reported as background-corrected values. Specifically, the mean number of detections measured in wildtype slices was used to calculate an equivalent receptor number, which was then subtracted from the copy number estimates across hippocampus, spinal cord and striatum. This procedure is described in the methods (p. 20) and results (p. 5, 8), and mentioned in the figure legends of Fig. 1C, 3C. The background corrected values are given in the text and Table 1.
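The background correction described above amounts to simple arithmetic, sketched here under stated assumptions: detections are converted to copy numbers by dividing by the mean number of detections per single receptor complex, and the wildtype background is converted the same way before subtraction. The function names and all numerical values are hypothetical and chosen only to make the calculation easy to follow.

```python
def copy_number(detections: float, detections_per_complex: float) -> float:
    """Convert a detection count into an equivalent number of receptor complexes."""
    if detections_per_complex <= 0:
        raise ValueError("detections_per_complex must be positive")
    return detections / detections_per_complex


def background_corrected_copy_number(
    synaptic_detections: float,
    wildtype_mean_detections: float,
    detections_per_complex: float,
) -> float:
    """Subtract the receptor-equivalent of the wildtype background."""
    raw = copy_number(synaptic_detections, detections_per_complex)
    background = copy_number(wildtype_mean_detections, detections_per_complex)
    return raw - background


# Hypothetical numbers (assumptions for illustration only):
# 30 detections at a synapse, 6 background detections on average in
# wildtype slices, 12 detections per single receptor complex.
corrected = background_corrected_copy_number(30, 6, 12)
print(corrected)
```

Whether negative values at individual synapses should be set to zero is a separate choice that this sketch deliberately leaves open.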

      Furthermore, the authors state that "The advantage of this estimation is that it is independent of the stoichiometry of heteropentameric GlyRs". However, if the stoichiometry is unknown, the number of counted GlyRβ subunits cannot simply be reported as the number of GlyRs. This should be discussed in more detail, and more carefully reported throughout the manuscript.

      *The reviewer is right to point this out. There is still some debate about the stoichiometry of heteropentameric GlyRs. Configurations with 2a:3b, 3a:2b and 4a:1b subunits have been advanced (e.g. Grudzinska et al. 2005 Neuron; Durisic et al. 2012 J Neurosci; Patrizio et al. 2017 Sci Rep; Zhu & Gouaux 2021 Nature). We have therefore chosen a quantification that is independent of the underlying stoichiometry. Since our quantification is based on very sparse clusters of mEos4b detections that likely originate from a single receptor complex (irrespective of its stoichiometry), the reported values actually reflect the number of GlyRs (and not GlyRb subunits). We have clarified this in the results (p. 5) and throughout the manuscript (Table 1). *

      • The dual-color imaging provides insights into the subsynaptic distribution of GlyRβ molecules in hippocampal synapses. Why are similar studies not performed on synapses in the ventral striatum where functionally relevant numbers of GlyRβ molecules are found? Here insights into the subsynaptic receptor distribution would be of much more interest as it can be tied to function.

      This is an interesting suggestion. However, the primary aim of our study was to identify the existence of GlyRs in hippocampal regions. At low copy numbers, the concept of sub-synaptic domains (SSDs, e.g. Yang et al. 2021 EMBO Rep) becomes irrelevant (see R1 point 13). It should be pointed out that the dSTORM pointillist images (Fig. 2A) represent individual GlyR detections rather than clusters of molecules. In the striatum, our specific purpose was to solve an open question about the presence of GlyRs in different subregions (putamen, nucleus accumbens).

      • It is unclear how the experiments in Figure 5 add to this study. These results are valid, but do not seem to directly test the hypothesis that "the expression of α subunits may be limiting factor controlling the number of synaptic GlyRs". These experiments simply test if overexpressed α subunits can be detected. If the α subunits are limiting, measuring the effect of α subunit overexpression on GlyRβ surface expression would be a more direct test.

      Both R1 and R2 have also commented on the data in Fig. 5 and their interpretation. We have substantially revised this section as described before (see R1 point 3) including additional experiments and quantification of the data (new Fig. 5). The findings lend support to our earlier hypothesis that GlyR alpha subunits (in particular GlyRa1) are the limiting factor for the expression of heteropentameric GlyRa/b in hippocampal neurons (pp. 13-14). Since the GlyRa1 subunit itself does not bind to gephyrin (Patrizio et al. 2017 Sci Rep), the synaptic localisation of the recombinant mEos4b-GlyRa1 subunits is proof that they have formed heteropentamers with endogenous GlyRb subunits and driven their membrane trafficking, which the GlyRb subunits are incapable of doing on their own.

      Reviewer #4 (Significance (Required)):

      These results are based on carefully performed single-molecule localization experiments, and are well-presented and described. The knockin mouse with endogenously tagged GlyRβ molecules is a very strong aspect of this study and provides confidence in the labeling, the combination with single-molecule localization microscopy is very strong as it provides high sensitivity and spatial resolution.

      The conceptual innovation however seems relatively modest, these results confirm previous studies but do not seem to add novel insights. This study is entirely descriptive and does not bring new mechanistic insights.

      This study could be of interest to a specialized audience interested in glycine receptor biology, inhibitory synapse biology and super-resolution microscopy.

      my expertise is in super-resolution microscopy, synaptic transmission and plasticity

      As we have stated before, the novelty of our study lies in the use of SMLM for the identification of very small numbers of molecules, which requires careful control experiments. This is something that has not been done before and that can be of interest to a wider readership, as it opens up SMLM for ultrasensitive detection of rare molecular events. Using this approach, we solve two open scientific questions: (1) the demonstration that low-copy GlyRs are present at inhibitory synapses in the hippocampus, (2) the sub-region specific expression and functional role of GlyRs in the ventral versus dorsal striatum.


      The following review was provided later under the name “Reviewer #4”. To avoid confusion with the last reviewer from above we will refer to this review as R4-2.


      Reviewer #4-2 (Evidence, reproducibility and clarity (Required)):


      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The authors investigate the presence of synaptic glycine receptors in the telencephalon, whose presence and function is poorly understood.

      Using a transgenically labeled glycine receptor beta subunit (Glrb-mEos4b) mouse model together with super-resolution microscopy (SMLM, dSTORM), they demonstrate the presence of a low but detectable amount of synaptically localized GLRB in the hippocampus. While they do not perform a functional analysis of these receptors, they do demonstrate that these subunits are integrated into the inhibitory postsynaptic density (iPSD) as labeled by the scaffold protein gephyrin. These findings demonstrate that a low level of synaptically localized glycine receptor subunits exist in the hippocampal formation, although whether or not they have a functional relevance remains unknown.

      They then proceed to quantify synaptic glycine receptors in the striatum, demonstrating that the ventral striatum has a significantly higher amount of GLRB co-localized with gephyrin than the dorsal striatum or the hippocampus. They then recorded pharmacologically isolated glycinergic miniature inhibitory postsynaptic currents (mIPSCs) from striatal neurons. In line with their structural observations, these recordings confirmed the presence of synaptic glycinergic signaling in the ventral striatum, and an almost complete absence in the dorsal striatum. Together, these findings demonstrate that synaptic glycine receptors in the ventral striatum are present and functional, while an important contribution to dorsal striatal activity is less likely.

      Lastly, the authors use existing mRNA and protein datasets to show that the expression level of GLRA1 across the brain positively correlates with the presence of synaptic GLRB.

      The authors use lentiviral expression of mEos4b-tagged glycine receptor alpha1, alpha2, and beta subunits (GLRA1, GLRA2, GLRB) in cultured hippocampal neurons to investigate the ability of these subunits to cause the synaptic localization of glycine receptors. They suggest that the alpha1 subunit has a higher propensity to localize at the inhibitory postsynapse (labeled via gephyrin) than the alpha2 or beta subunits, and may therefore contribute to the distribution of functional synaptic glycine receptors across the brain.

      Major comments:

      • Are the key conclusions convincing?

      The authors are generally precise in the formulation of their conclusions.

      • They demonstrate a very low, but detectable, amount of a synaptically localized glycine receptor subunit in a transgenic (GlrB-mEos4b) mouse model. They demonstrate that the GLRB-mEos4b fusion protein is integrated into the iPSD as determined by gephyrin labelling. The authors do not perform functional tests of these receptors and do not state any such conclusions.
      • The authors show that GLRB-mEos4b is clearly detectable in the striatum and integrated into gephyrin clusters at a significantly higher rate in the ventral striatum compared to the dorsal striatum, which is in line with previous studies.
      • Adding to their quantification of GLRB-mEos4b in the striatum, the authors demonstrate the presence of glycinergic miniature IPSCs in the ventral striatum, and an almost complete absence of mIPSCs in the dorsal striatum. These currents support the observation that GLRB-mEos4b is more synaptically integrated in the ventral striatum compared to the dorsal striatum.
      • The authors show that lentiviral expression of GLRA1-mEos4b leads to a visually higher number of GLR clusters in cultured hippocampal neurons, and a co-localization of some clusters with gephyrin. The authors claim that this supports the idea that GLRA1 may be an important driver of synaptic glycine receptor localization. However, no quantification or statistical analysis of the number of puncta or their colocalization with gephyrin is provided for any of the expressed subunits. Such a claim should be supported by quantification and statistics.

      A thorough analysis and quantification of the data in Fig. 5 has been carried out as requested by all the other reviewers (e.g. R1, point 3). The new data and results have been described in the revised manuscript (pp. 9-10, 13-14).

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      One unaddressed caveat is the fact that a GLRB-mEos4b fusion protein may behave differently in terms of localization and synaptic integration than wild-type GLRB. While unlikely, it is possible that mEos4b interacts either with itself or synaptic proteins in a way that changes the fused GLRB subunit’s localization. Such an effect would be unlikely to affect synaptic function in a measurable way, but might be detected at a structural level by highly sensitive methods such as SMLM and STORM in regions with very low molecule numbers (such as the hippocampus). Since reliable antibodies against GLRB in brain tissue sections are not available, this would be difficult to test. Considering that no functional measures of the hippocampal detections exist, we would suggest that this possible caveat be mentioned for this particular experiment.

      This question has also been raised before (R1, point 5). According to an earlier study the mEos4b-GlyRβ knock-in does not cause any obvious phenotypes, with the possible exception of minor loss of glycine potency (Maynard et al. 2021 eLife). The fact that the synaptic levels in the spinal cord in heterozygous animals are precisely half of those of homozygous animals argues against differences in receptor expression, heteropentameric assembly, forward trafficking to the plasma membrane and integration into the synaptic membrane as confirmed using quantitative super-resolution CLEM (Maynard et al. 2021 eLife). Accordingly, we did not observe any behavioural deficits in these animals, making it a powerful experimental model. We have added this information in the revised manuscript (p. 4).

      In addition, without any quantification or statistical analysis, the authors' claims regarding the necessity of GLRA1 expression for the synaptic localization of glycine receptors in cultured hippocampal neurons should probably be described as preliminary (Fig. 5).

      As mentioned before, we have substantially revised this part (R1, point 3). The quantification and analysis in the new Fig. 5 support our earlier interpretation.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The authors show that there is colocalization of gephyrin with the mEos4b-GlyRβ subunit using dual-colour SMLM. This is a powerful approach that allows for a claim to be made on the synaptic location of the glycine receptors. The images presented in Figure 1, together with the distance analysis in Figure 2, display the co-localization of the fluorophores. The co-localization images in all the selected regions, hippocampus and striatum, also show detections outside of the gephyrin clusters, which the authors refer to as extrasynaptic. These small punctate clusters seem to have the same size as the ones detected and assigned as part of the synapse. It would be informative if the authors analysed the distribution, density and size of these non-synaptic clusters, presented the data in the manuscript and compared them against the synaptic ones. Validating this extrasynaptic signal by staining for a dendritic marker, such as MAP-2, or perhaps a somatic marker, and assessing the co-localization with the non-synaptic clusters would add further credibility to them being extrasynaptic.

      The existence of extrasynaptic GlyRs is well attested in spinal cord neurons (e.g. Specht et al. 2013 Neuron; this study see Fig. S2). The fact that these appear as small clusters of detections in SMLM recordings results from the fact that a single fluorophore can be detected several times in consecutive image frames and because of blinking. Therefore, small clusters of detections likely represent single GlyRs (that can be counted), and not assemblies of several receptor complexes. Due to their diffusion in the neuronal membrane, they are seen as diffuse signals throughout the somatodendritic compartment in epifluorescence images (e.g. Fig. 5A). SMLM recordings of the same cells resolve this diffuse signal into discrete nanoclusters representing individual receptors (Fig. 5B). It is not clear what information co-localisation experiments with specific markers could provide, especially in hippocampal neurons, in which the copy numbers (and density) of GlyRs are next to zero.
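
      The point about repeated detections can be illustrated with a minimal sketch. It is not the analysis used in the manuscript: the function name `merge_detections` and the thresholds `max_dist_nm` and `max_dark_frames` are hypothetical, chosen only to show how detections that recur at nearly the same position in nearby frames could be grouped into one putative emitter.

```python
import math

def merge_detections(detections, max_dist_nm=50.0, max_dark_frames=5):
    """Greedily group (frame, x_nm, y_nm) detections into putative single emitters.

    A detection joins an existing group if it lies within max_dist_nm of the
    group's most recent detection and appears in a later frame, with at most
    max_dark_frames empty frames in between (to tolerate blinking).
    Both thresholds are illustrative assumptions, not published values.
    """
    groups = []
    for det in sorted(detections):  # process in frame order
        frame, x, y = det
        best, best_dist = None, max_dist_nm
        for group in groups:
            last_frame, last_x, last_y = group[-1]
            gap = frame - last_frame
            dist = math.hypot(x - last_x, y - last_y)
            # gap > 0 prevents merging two detections from the same frame
            if 0 < gap <= max_dark_frames + 1 and dist <= best_dist:
                best, best_dist = group, dist
        if best is None:
            groups.append([det])
        else:
            best.append(det)
    return groups

example = [(1, 100, 100), (2, 102, 99), (4, 101, 101), (1, 500, 500), (50, 100, 100)]
groups = merge_detections(example)
```

      In this toy input, the three detections near (100, 100) in frames 1, 2 and 4 fall into one group, whereas the detection at the same position in frame 50 starts a new group because the dark interval exceeds the assumed limit.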

      In addition we would encourage the authors to quantify the clustering and co-localization of virally expressed GLRA1, GLRA2, and GLRB with gephyrin in order to support the associated claims (Fig. 5). Preferably, the density of GLR and gephyrin clusters (at least on the somatic surface, the proximal dendrites, or both) as well as their co-localization probability should be quantified if a causal claim about subunit-specific requirements for synaptic localization is to be made.

      Quantification of the data has been carried out (new Fig. 5C,D). The results have been described before (R1, point 3) and support our earlier interpretation of the data (pp. 13-14).

      Lastly, even though it may be outside of the scope of such a study, analysing other parts of the hippocampal area could provide additional important information. If one looks at the Allen Institute’s ISH of the beta subunit, the strongest signal comes from the stratum oriens in the CA1, for example, suggesting that interneurons residing there would more likely have a higher expression of the glycine receptors. This could also be assessed by looking more carefully at the single cell transcriptomics, to see which cell types in the hippocampus show the highest mRNA levels. If the authors think that this is too much additional work, then perhaps a mention of this in the discussion would be good.

      We have added the requested information from the ISH database of the Allen Institute in the discussion as suggested by the reviewer (p. 12). However, in combination with the transcriptomic data (Fig. S1) our findings strongly suggest that the expression of synaptic GlyRs depends on the availability of alpha subunits rather than on the presence of the GlyRβ transcript. This is obvious when one compares the mRNA levels in the hippocampus with those in the basal ganglia (striatum) and medulla. While the transcript concentrations of GlyRβ are elevated in all three regions and essentially the same, our data show that the GlyRβ copy numbers at synapses differ over more than 2 orders of magnitude (Fig. 1B, Table 1).

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Since the labeling and some imaging has been performed already, the requested experiment would be a matter of deploying a method of quantification. In principle, it should not require any additional wet-lab experiments, although it may require additional imaging of existing samples.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes, for the most part.

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      Minor comments:

      • Specific experimental issues that are easily addressable.

      N/A

      • Are prior studies referenced appropriately?

      Yes

      • Are the text and figures clear and accurate?

      Yes, although quantification in figure 5 is currently not present.

      A quantification has been added (see R1, point 3).

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      This paper presents a method that could be used to localize receptors and perhaps other proteins that are in low abundance or for which a detailed quantification is necessary. I would therefore suggest that Figure S4 is included into Figure 2 as the first panel, showcasing the demixing, followed by the results.

      We agree in principle with this suggestion. However, the revised Fig. S4 is more complex and we think that it would distract from the data shown in Fig. 2. Given that Fig. S4 is mostly methodological and not essential to understand the text, we have kept it in the supplement for the time being. We leave the final decision on this point to the editor.

      Reviewer #4-2 (Significance (Required)):

      [This review was supplied later]

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Using a novel and high-resolution method, the authors have provided strong evidence for the presence of glycine receptors in the murine hippocampus and in the dorsal striatum. The number of receptors calculated is small compared to the numbers found in the ventral striatum. This is the first study to quantify receptor numbers in these regions. In addition, it also lays a roadmap for future studies addressing similar questions.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      This is done well by the authors in the curation of the literature. As stated above, the authors have filled a gap in the presence of glycine receptors in different brain regions, a subject of importance in understanding the role they play in brain activity and function.

      • State what audience might be interested in and influenced by the reported findings.

      Neuroscientists working at the synaptic level, on inhibitory neurotransmission and on fundamental mechanisms of expression of genes at low levels and their relationship to the presence of the protein would be interested. Furthermore, researchers in neuroscience and cell biology may benefit from and be inspired by the approach used in this manuscript, to potentially apply it to address their own aims.

      We thank the reviewer for the positive assessment of the technical and biological implications of our work, as well as the interest of our findings to a wide readership of neuroscientists and cell biologists.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Synaptic transmission, inhibitory cells and GABAergic synapses functionally and structurally, cortex and cortical circuits. No strong expertise in super-resolution imaging methods.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary: Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The authors investigate the presence of synaptic glycine receptors in the telencephalon, whose presence and function are poorly understood.

      Using a transgenically labeled glycine receptor beta subunit (Glrb-mEos4b) mouse model together with super-resolution microscopy (SMLM, dSTORM), they demonstrate the presence of a low but detectable amount of synaptically localized GLRB in the hippocampus. While they do not perform a functional analysis of these receptors, they do demonstrate that these subunits are integrated into the inhibitory postsynaptic density (iPSD) as labeled by the scaffold protein gephyrin. These findings demonstrate that a low level of synaptically localized glycine receptor subunits exists in the hippocampal formation, although whether or not they have a functional relevance remains unknown.

      They then proceed to quantify synaptic glycine receptors in the striatum, demonstrating that the ventral striatum has a significantly higher amount of GLRB co-localized with gephyrin than the dorsal striatum or the hippocampus. They then recorded pharmacologically isolated glycinergic miniature inhibitory postsynaptic currents (mIPSCs) from striatal neurons. In line with their structural observations, these recordings confirmed the presence of synaptic glycinergic signaling in the ventral striatum, and an almost complete absence in the dorsal striatum. Together, these findings demonstrate that synaptic glycine receptors in the ventral striatum are present and functional, while an important contribution to dorsal striatal activity is less likely.

      Lastly, the authors use existing mRNA and protein datasets to show that the expression level of GLRA1 across the brain positively correlates with the presence of synaptic GLRB. The authors use lentiviral expression of mEos4b-tagged glycine receptor alpha1, alpha2, and beta subunits (GLRA1, GLRA2, GLRB) in cultured hippocampal neurons to investigate the ability of these subunits to cause the synaptic localization of glycine receptors. They suggest that the alpha1 subunit has a higher propensity to localize at the inhibitory postsynapse (labeled via gephyrin) than the alpha2 or beta subunits, and may therefore contribute to the distribution of functional synaptic glycine receptors across the brain.

      Major comments:

      • Are the key conclusions convincing?

      The authors are generally precise in the formulation of their conclusions.

      1) They demonstrate a very low, but detectable, amount of a synaptically localized glycine receptor subunit in a transgenic (GlrB-mEos4b) mouse model. They demonstrate that the GLRB-mEos4b fusion protein is integrated into the iPSD as determined by gephyrin labelling. The authors do not perform functional tests of these receptors and do not state any such conclusions.
      2) The authors show that GLRB-mEos4b is clearly detectable in the striatum and integrated into gephyrin clusters at a significantly higher rate in the ventral striatum compared to the dorsal striatum, which is in line with previous studies.
      3) Adding to their quantification of GLRB-mEos4b in the striatum, the authors demonstrate the presence of glycinergic miniature IPSCs in the ventral striatum, and an almost complete absence of mIPSCs in the dorsal striatum. These currents support the observation that GLRB-mEos4b is more synaptically integrated in the ventral striatum compared to the dorsal striatum.
      4) The authors show that lentiviral expression of GLRA1-mEos4b leads to a visually higher number of GLR clusters in cultured hippocampal neurons, and a co-localization of some clusters with gephyrin. The authors claim that this supports the idea that GLRA1 may be an important driver of synaptic glycine receptor localization. However, no quantification or statistical analysis of the number of puncta or their colocalization with gephyrin is provided for any of the expressed subunits. Such a claim should be supported by quantification and statistics.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      One unaddressed caveat is the fact that a GLRB-mEos4b fusion protein may behave differently in terms of localization and synaptic integration than wild-type GLRB. While unlikely, it is possible that mEos4b interacts either with itself or synaptic proteins in a way that changes the fused GLRB subunit's localization. Such an effect would be unlikely to affect synaptic function in a measurable way, but might be detected at a structural level by highly sensitive methods such as SMLM and STORM in regions with very low molecule numbers (such as the hippocampus). Since reliable antibodies against GLRB in brain tissue sections are not available, this would be difficult to test. Considering that no functional measures of the hippocampal detections exist, we would suggest that this possible caveat be mentioned for this particular experiment.

      In addition, without any quantification or statistical analysis, the authors' claims regarding the necessity of GLRA1 expression for the synaptic localization of glycine receptors in cultured hippocampal neurons should probably be described as preliminary (Fig. 5).

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The authors show that there is colocalization of gephyrin with the mEos4b-GlyRβ subunit using dual-colour SMLM. This is a powerful approach that allows for a claim to be made on the synaptic location of the glycine receptors. The images presented in Figure 1, together with the distance analysis in Figure 2, display the co-localization of the fluorophores. The co-localization images in all the selected regions, hippocampus and striatum, also show detections outside of the gephyrin clusters, which the authors refer to as extrasynaptic. These small punctate clusters seem to have the same size as the ones detected and assigned as part of the synapse. It would be informative if the authors analysed the distribution, density and size of these non-synaptic clusters, presented the data in the manuscript and compared them against the synaptic ones. Validating this extrasynaptic signal by staining for a dendritic marker, such as MAP-2, or perhaps a somatic marker, and assessing the co-localization with the non-synaptic clusters would add further credibility to them being extrasynaptic.

      In addition we would encourage the authors to quantify the clustering and co-localization of virally expressed GLRA1, GLRA2, and GLRB with gephyrin in order to support the associated claims (Fig. 5). Preferably, the density of GLR and gephyrin clusters (at least on the somatic surface, the proximal dendrites, or both) as well as their co-localization probability should be quantified if a causal claim about subunit-specific requirements for synaptic localization is to be made.

      Lastly, even though it may be outside of the scope of such a study, analysing other parts of the hippocampal area could provide additional important information. If one looks at the Allen Institute's ISH of the beta subunit, the strongest signal comes from the stratum oriens in the CA1, for example, suggesting that interneurons residing there would more likely have a higher expression of the glycine receptors. This could also be assessed by looking more carefully at the single cell transcriptomics, to see which cell types in the hippocampus show the highest mRNA levels. If the authors think that this is too much additional work, then perhaps a mention of this in the discussion would be good.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Since the labeling and some imaging has been performed already, the requested experiment would be a matter of deploying a method of quantification. In principle, it should not require any additional wet-lab experiments, although it may require additional imaging of existing samples.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes, for the most part.

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      Minor comments:

      • Specific experimental issues that are easily addressable.

      N/A

      • Are prior studies referenced appropriately?

      Yes

      • Are the text and figures clear and accurate?

      Yes, although quantification in figure 5 is currently not present.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      This paper presents a method that could be used to localize receptors and perhaps other proteins that are in low abundance or for which a detailed quantification is necessary. I would therefore suggest that Figure S4 is included into Figure 2 as the first panel, showcasing the demixing, followed by the results.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Using a novel and high-resolution method, the authors have provided strong evidence for the presence of glycine receptors in the murine hippocampus and in the dorsal striatum. The number of receptors calculated is small compared to the numbers found in the ventral striatum. This is the first study to quantify receptor numbers in these regions. In addition, it also lays a roadmap for future studies addressing similar questions.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      This is done well by the authors in the curation of the literature. As stated above, the authors have filled a gap in the presence of glycine receptors in different brain regions, a subject of importance in understanding the role they play in brain activity and function.

      • State what audience might be interested in and influenced by the reported findings.

      Neuroscientists working at the synaptic level, on inhibitory neurotransmission and on fundamental mechanisms of expression of genes at low levels and their relationship to the presence of the protein would be interested. Furthermore, researchers in neuroscience and cell biology may benefit from and be inspired by the approach used in this manuscript, to potentially apply it to address their own aims.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Synaptic transmission, inhibitory cells and GABAergic synapses functionally and structurally, cortex and cortical circuits. No strong expertise in super-resolution imaging methods.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In their manuscript "Single molecule counting detects low-copy glycine receptors in hippocampal and striatal synapses" Camuso and colleagues apply single molecule localization microscopy (SMLM) methods to visualize low copy numbers of GlyRs at inhibitory synapses in the hippocampal formation and the striatum. SMLM analysis revealed higher copy numbers in striatum compared to hippocampal inhibitory synapses. They further provide evidence that these low copy numbers are tightly linked to post-synaptic scaffolding protein gephyrin at inhibitory synapses. Their approach profits from the high sensitivity and resolution of SMLM and challenges the controversial view on the presence of GlyRs in these formations although there are reports (electrophysiology) on the presence of GlyRs in these particular brain regions. These new datasets in the current manuscript may certainly assist in understanding the complexity of fundamental building blocks of inhibitory synapses.

      However I have some minor points that the authors may address for clarification:

      1. In Figure 1 the authors apply PALM imaging of mEos4b-GlyRβ (knock-in), and here the corresponding Sylite label seems to be recorded in widefield; it is not clearly stated in the figure legend whether it is widefield or super-resolved. In Fig. 1A, is the scale bar 5 µm? Some Sylite spots appear to be sized around 1 µm, especially the brighter spots, but maybe this is due to the lower resolution of widefield imaging? Regarding the statistical comparison: what method was chosen to test for normality? I think this point is missing in the methods section. Moreover, I would appreciate a clarification and/or citation that the knock-in model results in no structural and physiological changes at inhibitory synapses. I believe this model has been applied in previous studies, so a corresponding clarification can be provided.
      2. In the next set of experiments the authors switch to demixing dSTORM experiments - an explanation why this is performed is missing in the text - I guess better resolution to perform more detailed distance measurements? For these experiments: which region of the hippocampus did the authors select, I cannot find this information in legend or main text.
      3. Regarding parameters of demixing experiments: the number of frames (10,000) seems quite low and the exposure time higher than expected for Alexa 647. Can the authors explain the reason for choosing these particular parameters (low expression profile of the target, so better separation? fewer fluorophores on the label and shorter collection time?) or is there a reference that can be cited? The laser power is given in the methods as a percentage of maximal output power, but for better comparison and reproducibility I recommend providing the values from a power meter (kW/cm²), as lasers may change their maximum output power during their lifetime.
      4. For analysis of subsynaptic distribution: how did the authors decide on the parameters in the NEO software for DBSCAN clustering? Was a series of parameters tested to find optimal conditions, and did the analysis start with an initial test of whether the data are indeed clustered (Ripley's K), or is there a reference in the literature that can be provided?
      5. A conclusion/discussion of the results presented in Figure 5 is missing in the text/discussion.
      6. In line 552 "suspension" is misleading, better use "solution".
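
      As a rough illustration of the conversion requested in point 3, the sketch below turns a power-meter reading into a mean power density. It assumes the power is measured at the sample plane and spread uniformly over a circular illuminated area; the function name `power_density_kw_per_cm2` and the example values are hypothetical and are not taken from the manuscript.

```python
import math

def power_density_kw_per_cm2(power_mw, spot_diameter_um):
    """Mean power density for a uniformly illuminated circular field.

    power_mw:         power measured at the sample plane, in milliwatts
    spot_diameter_um: diameter of the illuminated area, in micrometres
    Assumes flat-top illumination; a Gaussian beam has a higher peak value.
    """
    radius_cm = (spot_diameter_um * 1e-4) / 2.0   # 1 µm = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    power_kw = power_mw * 1e-6                    # 1 mW = 1e-6 kW
    return power_kw / area_cm2

# Hypothetical example: 100 mW over a 50 µm wide field gives about 5.1 kW/cm²
density = power_density_kw_per_cm2(100.0, 50.0)
```

      Reporting this value alongside the nominal percentage makes acquisitions comparable across instruments and across the lifetime of a laser.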

      Significance

      Significance: The manuscript provides new insights into the presence of low-copy GlyRs by visualizing them via SMLM. This is the first report that visualizes GlyRs optically in the brain using the knock-in model of mEos4b-tagged GlyRβ and quantifies their copy numbers, comparing the distribution and amount of GlyRs in the hippocampus and striatum. Imaging data correspond well to electrophysiological measurements in the manuscript.

      Field of expertise: Super-Resolution Imaging and corresponding analysis

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors investigate the nanoscopic distribution of glycine receptor subunits in the hippocampus, dorsal striatum, and ventral striatum of the mouse brain using single-molecule localization microscopy (SMLM). They demonstrate that only a small number of glycine receptors are localized at hippocampal inhibitory synapses. Using dual-color SMLM, they further show that clusters of glycine receptors are predominantly localized within gephyrin-positive synapses. A comparison between the dorsal and ventral striatum reveals that the ventral striatum contains approximately eight times more glycine receptors and this finding is consistent with electrophysiological data on postsynaptic inhibitory currents. Finally, using cultured hippocampal neurons, they examine the differential synaptic localization of glycine receptor subunits (α1, α2, and β). This study is significant as it provides insights into the nanoscopic localization patterns of glycine receptors in brain regions where this protein is expressed at low levels. Additionally, the study demonstrates the different localization patterns of GlyR in distinct striatal regions and its physiological relevance using SMLM and electrophysiological experiments. However, several concerns should be addressed.

      The following are specific comments:

      1. Colocalization analysis in Figure 1A. The colocalization between Sylite and mEos-GlyRβ appears to be quite low. It is essential to assess whether the observed colocalization is not due to random overlap. The authors should consider quantifying colocalization using statistical methods, such as a pixel shift analysis, to determine whether colocalization frequencies remain similar after artificially displacing one of the channels.
      2. Inconsistency between Figure 3A and 3B. While Figure 3B indicates an ~8-fold difference in the number of mEos4b-GlyRβ detections per synapse between the dorsal and ventral striatum, Figure 3A does not appear to show a pronounced difference in the localization of mEos4b-GlyRβ on Sylite puncta between these two regions. If the images presented in Figure 3A are not representative, the authors should consider replacing them with more representative examples or providing expanded images with multiple representative examples. Alternatively, if this inconsistency can be explained by differences in spot density within clusters, the authors should explain that.
      3. Quantification in Figure 5. It is recommended that the authors provide quantitative data on cluster formation and colocalization with Sylite puncta in Figure 5 to support their qualitative observations.
      4. Potential for pseudo-replication. It's not clear whether they're performing statistical tests across biological replicates, images, or even synapses. They often quote mean +/- SEM with n = 1000s, so does that mean they're doing tests on those 1000s? Need to clarify.
      5. Does mEos affect expression levels or function of the protein? Can't see any experiments done to confirm this. Could suggest WB on homogenate, or mass spec?
      6. Quantification of protein numbers is challenging with SMLM. Issues include i) some of FP not correctly folded/mature, and ii) dependence of localisation rate on instrument, excitation/illumination intensities, and also the thresholds used in analysis. Can the authors compare with another protein that has known expression levels- e.g. PSD95? This is quite an ask, but if they could show copy number of something known to compare with, it would be useful.
      7. Rationale for doing nanobody dSTORM not clear at all. They don't explain the reason for doing the dSTORM experiments. Why not just rely on PALM for coincidence measurements, rather than tagging mEos with a nanobody, and then doing dSTORM with that? Can they explain? Is it to get extra localisations, i.e. multiple per nanobody? If so, localising the same FP multiple times wouldn't improve resolution. Also, no controls for nanobody dSTORM experiments: what about a non-specific nanobody, or use on WT sections?
      8. What resolutions/precisions were obtained in SMLM experiments? Should perform Fourier Ring Correlation (FRC) on SR images to state resolutions obtained (particularly useful for when they're presenting distance histograms, as this will be dependent on resolution). Likewise for precision, what was mean precision? Can they show histograms of localisation precision.
      9. Why were DBSCAN parameters selected? How can they rule out multiple localisations per fluor? If low copy numbers (<10), then why bother with DBSCAN? Could just measure distance to each one.
      10. For microscopy experiment methods, state power densities, not % or "nominal power".
      11. In general, not much data presented. Any SI file with extra images etc.?
      12. Clarification of the discussion on GlyR expression and synaptic localization: The discussion on GlyR expression, complex formation, and synaptic localization is sometimes unclear, and needs terminological distinctions between "expression level", "complex formation" and "synaptic localization". For example, the authors state: "What then is the reason for the low protein expression of GlyRβ? One possibility is that the assembly of mature heteropentameric GlyR complexes depends critically on the expression of endogenous GlyR α subunits." Does this mean that GlyRβ proteins that fail to form complexes with GlyRα subunits are unstable and subject to rapid degradation? If so, the authors should clarify this point. The statement "This raises the interesting possibility that synaptic GlyRs may depend specifically on the concomitant expression of both α1 and β transcripts." suggests a dependency on α1 and β transcripts. However, is the authors' focus on synaptic localization or overall protein expression levels? If this means synaptic localization, it would be beneficial to state this explicitly to avoid confusion. To improve clarity, the authors should carefully distinguish between these different aspects of GlyR biology throughout the discussion. Additionally, a schematic diagram illustrating these processes would be highly beneficial for readers.
      13. Interpretation of GlyR localization in the context of nanodomains. The distribution of GlyR molecules on inhibitory synapses appears to be non-homogeneous, instead forming nanoclusters or nanodomains, similar to many other synaptic proteins. It is important to interpret GlyR localization in the context of nanodomain organization.

      Significance

      The paper presents biological and technical advances. The biological insights revolve mostly around the documentation of glycine receptors in particular synapses in the forebrain, where they are typically expressed at very low levels. The authors provide compelling data indicating that the expression is of physiological significance. The authors have done a nice job of combining genetically tagged mice with advanced microscopy methods to tackle the question of distributions of synaptic proteins. Overall, these advances are more incremental than groundbreaking.

  3. freelanceastrophysicist.com freelanceastrophysicist.com
    1. for - book - More Everything Forever - Adam Becker - from - Youtube - Essentia Foundation - interview - Alex Gomez-Marin - Neuroscientist speaks out on the hidden war on consciousness - https://hyp.is/ile8TIvJEfCl35MW3f5B8Q/www.youtube.com/watch?v=K7NIicE_h9w

      Summary - Interesting adjacency with another video I've been watching, that focused on a Western monk's practice of Tibetan Buddhism, who after 12 years, entered a 4 year retreat and panicked - His demons emerged in the first 2 years of the retreat and he left but returned - This monk emphasized accepting the relationship with his demons instead of averting them and how craving and desire emphasized by Western civilization is the cause of modernity's meaning crisis - to - Youtube - Diary of a CEO - Your brain is lying to you - Interview - Gerong Tupton - https://hyp.is/go?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DvIbLQQ1i56Y&group=world

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03111

      Corresponding author(s): Qingyin Qian and Ryusuke Niwa

      1. General Statements [optional]

      We would like to thank reviewers for their feedback on our initial submission. Changes in figures were noted in the point-to-point reply. For submission of our current revised manuscript, we provide two Word files, which are the “clean” and “Track-and-Change” files. Page and line numbers described below correspond to those of the “clean” file. The “Track-and-Change” file might be helpful for Reviewers to find what we have changed for the current revision.

      In the revised manuscript, major changes in the text were tracked, while minor edits in figure numbers and legends were not tracked. In the Discussion, the section “Xrp1-mediated EE plasticity…” was moved before “Xrp1, a transcription factor …”, to follow the order of the Results, and was split into two: “EE plasticity …” and “Xrp1-mediated EE plasticity …”.

      2. Description of the planned revisions

      - The authors should investigate the regenerative growth of the adult midgut after irradiation. Is there an impact on ISC proliferation or cell turnover? Is Xrp1 in EEs required in this adaptive response? It would be elegant to use the recently generated tracing method by Tobias Reiff lab to observe overall impact on tissue renewal (rapport-tracing esglexReDDM esg-lexA, 13xLexAop2-CD8::GFP, 13xLexAop2-H2B::mCherry::HA, tub-Gal80ts on the second chromosome). It can be combined with any EE Gal4 driver (see Nat Commun 2025, https://doi.org/10.1038/s41467-024-55664-2, the stock already exists, see Table 1). This reviewer thinks that it is a key experiment to support the proposed model.

      2.1. Author response:

      We will conduct the following experiments to answer these criticisms.

      (1) We will investigate the ISC behavior, proliferation and differentiation, after 100 Gy of radiation by examining changes in the number of progenitor cells and their progenies, using esgtsF/O (esg-Gal4, UAS-GFP, tub-Gal80ts; Act>Cd2>Gal4, UAS-Flp) generated in the study (Jiang et al. Cell 2009 DOI: 10.1016/j.cell.2009.05.014) or esgReDDM (esg-Gal4, UAS-CD8::GFP; UAS-H2B::RFP, tubGal80ts) generated in the study (Antonello et al. EMBO J. 2015 DOI: 10.15252/embj.201591517). Flies will have progenitor cell lineages traced for 7 days, irradiated on day 6, and examined at different time points after radiation, following the design shown in Fig. 2A. Based on the previous findings (Sharma et al. Sci. Rep. 2020 DOI: 10.1038/s41598-020-75867-z; Pyo et al. Radiat. Res. 2014 DOI: 10.1667/RR13545.1), we anticipate that radiation compromises ISCs’ proliferation and differentiation. Should this be the case, our results can be interpreted in relation to those earlier studies.

      (2) In parallel, we will examine whether Xrp1 expression in EEs affects radiation-induced ISC behaviors. As suggested, we will use “EE Rapport” (esg-lexA, 13xLexAop2-CD8::GFP, 13xLexAop2-H2B::mCherry::HA, tub-Gal80ts; Rab3-Gal4) generated in the study (Zipper et al. Nat. Commun. 2025 DOI: 10.1038/s41467-024-55664-2) and compare control flies to flies with Xrp1 knocked down in EEs to assess the impact on ISC behaviors.

      - Is p53 required for Xrp1 induction in the gut after irradiation?

      2.2. Author response:

      To answer this point, we will perform immunostaining with an anti-Xrp1 antibody to examine whether p53 is required for Xrp1 induction in irradiated flies with p53 knocked down in EEs.

      - Xrp1 overexpression has been shown to induce the upd3 ligand, and nutrient-driven dedifferentiation of enteroendocrine cells occurs by activation of the JAK-STAT pathway (DOI: 10.1016/j.devcel.2023.08.022). Could the authors test the function of this signaling pathway during irradiation (upd3-lacZ and Stat-GFP can be used in parallel with upd3 RNAi and UAS Dome-DN)?

      2.3. Author response:

      We will conduct the following experiments to answer these points.

      (1) We will examine the cell type in which upd3 ligand induction occurs after radiation by using the upd3.1-LacZ reporter generated in the study (Jiang et al. Cell Stem Cell 2011 DOI: 10.1016/j.stem.2010.11.026).

      (2) One possibility is that upd3.1-LacZ is detected in EEs. In this case, we will examine the requirement of upd3 in EEs for radiation-induced EE plasticity by knocking down upd3. Another possibility is that upd3.1-LacZ is detected in non-EE cells. If so, we will examine the requirement of the JAK-STAT pathway in EEs by overexpressing dome[△cyt] generated in the study (Brown et al. Curr. Biol. 2001 DOI: 10.1016/s0960-9822(01)00524-3) or knocking down Stat92E in EEs. Because these conditions are not mutually exclusive, both approaches may be pursued, with the latter relating our results to nutrient-driven EE dedifferentiation.

      - Xrp1 is known for its role in cell competition and elimination of loser cells by induction of apoptosis. It would be interesting to check for induction of cell death and/or caspase activation in the fly gut after irradiation and verify a non-apoptotic role of DRONC activation in this context using a Dronc RNAi (as proposed by Bergmann lab (https://doi.org/10.1038/s41598-021-81261-0) or Baena-Lopez lab (DOI: 10.15252/embr.201948892)). Overexpression of Xrp1 could be combined with UAS-p35.

      2.4. Author response:

      To address these points, we will investigate apoptosis induction following radiation with anti-cleaved Dcp-1 immunostaining. Based on the previous finding (Sharma et al. Sci. Rep. 2020 DOI: 10.1038/s41598-020-75867-z), we anticipate seeing increased cleaved Dcp-1 signals in all cell types after radiation. We intend to clarify whether radiation increases the ratio of apoptotic EEs among EEs; however, we cannot yet be certain whether it will be feasible.

      Regarding Dronc activation, we previously requested the antibody used in the study (Wilson et al. Nat. Cell Biol. 2002 DOI: 10.1038/ncb799; Lindblad et al. Sci. Rep. 2021 DOI: 10.1038/s41598-021-81261-0) and tested it in our context, after radiation and upon Xrp1-S O/E in EEs. We present our data below. In the anterior midgut, anti-Dronc signals were not observed under either control condition. After radiation and upon Xrp1-S O/E in EEs, anti-Dronc signals were seen in a subset of past EEs (#2 past) and progenitor cells (#3 prgn), implying their EB identity. However, anti-Dronc signals were never observed in current EEs (#1 current), suggesting Dronc does not act directly downstream of Xrp1.

      We will address UAS-p35 in 3.3. Author response and Dronc-RNAi in 4.2. Author response.

      - The authors do not justify or explain why they used 100 Gy of radiation. This is higher than doses used in comparable regeneration studies in adult Drosophila (e.g., PMID: 25959206, PMID: 28925355). The authors should clarify why this dose was chosen.

      2.5. Author response:

      Our initial rationale was based on the paper (Sharma et al. Sci. Rep. 2020 DOI: 10.1038/s41598-020-75867-z), where the authors claimed that ISC proliferation was inhibited and the ISC number was decreased by 100 Gy of radiation.

      Nevertheless, we understand the reviewer’s concern and will examine 50 Gy of radiation as used in the papers the reviewer listed. We will examine radiation-induced changes in EE lineages and ISC behaviors. Depending on the results, we will evaluate whether and how they should be incorporated into the manuscript.

      - Fig. 2C, the number of past EE’s increased transiently so that baseline number is restored at 18 hr after IR. The authors conclude that fate plasticity is a transient event. Can they rule out loss due to cell death?

      2.6. Author response:

      In our system, past EEs were detected transiently but did not persist. We agree that we cannot distinguish whether the transient appearance of past EEs reflects transient adoption of another identity that ends in cell death or reversible plasticity.

      To partially address this criticism, as noted in 2.4. Author response, we will examine the apoptosis marker cleaved Dcp-1, which also tests whether cleaved Dcp-1-positive cells can be past EEs. However, regardless of detecting apoptosis markers in past EEs, we have changed “transient” into “temporary” to describe a short-lived cell state (see Page 8, Line 178; Page 15, Line 338).

      - The authors interpret fate-conversion as beneficial for tissue repair but never test whether blocking this process impairs recovery or organismal survival or whether promoting it improves outcomes.

      2.7. Author response:

      We have removed this potentially misleading interpretation (see Page 4, removed the last part of the previous introduction, “and propose the possibility that such plasticity contributes to tissue repair”). We present below the data showing a severe reduction of the ISC number in 7-day post-radiation guts, suggesting the inability of tissue repair. We will add this to the manuscript together with results from the following experiments.

      (1) We will examine if the blockage of radiation-induced EE plasticity, via knocking down Xrp1 in EEs, alters the epithelial cell number and cell junction protein localization.

      (2) To complement the result of plasticity inhibition, we attempt to promote plasticity by overexpressing Xrp1 in EEs, to test whether this rescues ISC loss or restores junctions.

      Should knockdown worsen ISC loss and junction integrity, or overexpression rescue them, we will describe EE plasticity as beneficial; otherwise, we will present it as a radiation-induced response without inferring benefits, while noting our limitations.

      We will address organismal survival in 4.3. Author response.

      - Related to the above, it would be helpful to know if fate-converted cells function as true ISCs or ECs (e.g., through proliferation or absorption assays).

      2.8. Author response:

      To partially answer this criticism, we will examine whether EE-derived ISCs are proliferative by examining whether they can be positive for the mitotic marker phospho-histone 3.

      We will address absorption assays in 4.4. Author response.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      - It is surprising to observe EE dedifferentiation at a steady state during homeostasis, a condition in which Xrp1 is not detected in the gut. Can the authors comment on this point in the discussion?

      3.1. Author response:

      We have added our thoughts on Xrp1 not being detectable in homeostatic EE lineages (see Page 15, Line 350 - 356). We have also added our thoughts regarding the observation of EE plasticity in homeostatic guts (see Page 14, Line 322 - 332).

      - Xrp1 exists as short or long isoforms. The short form has been recently proposed to be required for cell competition (https://doi.org/10.1101/2025.06.15.659587) whereas Xrp1 long isoform may be responsible for reduced cell growth. Could the authors test which isoform is induced in the gut after irradiation? Does overexpression of the Xrp1 long isoform have the same effect as the short isoform used by the authors?

      3.2. Author response:

      We have added data on the effect of Xrp1 long isoform overexpression on EE plasticity (see Fig. 5A - 5B, Page 12, Line 276 - 278), showing that overexpression of the Xrp1 long isoform caused a similar increase in past EEs. In addition, we have changed Xrp1 O/E to Xrp1-S O/E in the contents related to Figs 4, 5, S4, and S5.

      We will address radiation-induced Xrp1 isoforms in 4.1. Author response.

      - Xrp1 is known for its role in cell competition and elimination of loser cells by induction of apoptosis. It would be interesting to check for induction of cell death and/or caspase activation in the fly gut after irradiation and verify a non-apoptotic role of DRONC activation in this context using a Dronc RNAi (as proposed by Bergmann lab (https://doi.org/10.1038/s41598-021-81261-0) or Baena-Lopez lab (DOI: 10.15252/embr.201948892)). Overexpression of Xrp1 could be combined with UAS-p35.

      3.3. Author response:

      We have added data regarding p35 O/E combined with Xrp1 O/E, showing that p35 O/E did not further increase the number of past EEs, thereby suggesting that Xrp1-driven EE plasticity has a non-apoptotic nature (see Fig. 5C - 5D, Page 13, Line 293 - 297).

      - Line 221: fig S3E should be S3F

      - Line 230: fig S3F-G should be S3G-H

      3.4. Author response:

      We have fixed these errors.

      - The posterior gut region R4 is more proliferative than the anterior part and is usually used for testing regenerative growth. What is happening there after irradiation?

      3.5. Author response:

      We present below radiation-induced changes in EE lineages and ISC number in the R4bc gut region. Radiation did not alter the proportion of past EEs among EE lineages but reduced the ISC number. We acknowledge differences between anterior and posterior gut regions, but we do not plan to further analyze regional differences or underlying mechanisms.

      - The authors’ explanation for cells with weak GFP in Figure 1 is not convincing. Induction of GFP is an all-or-nothing event as it results from Pros-driven FLPase and a recombination that removes the transcription stop signals to express GFP from a Ubi promoter. Once that happens, it should not matter how strong or weak Pros is, GFP should be the same. So, another explanation is needed. Nuclear staining of cell #2 in Fig 1B resembles a metaphase chromosome arrangement. Nuclear GFP may appear ‘weak’ in mitosis as the nuclear envelope breaks down. It is positive for the purple Pros/Dl stain, which makes it hard to tell if it is Pros+ or Pros- even though the authors state that cells with weak GFP are Pros- in line 104 (see the point above regarding confusing same-color stain for ISC and EE markers). Could cell #2 be a pre-EE that is undergoing mitosis since the lineage tracer marks both EE and pre-EE cells (line 119)? Or do the authors mean recombination on one or both homologs? This should not be possible since the cells are heterozygotes for the Ubi-GFP locus.

      3.6. Author response:

      For cell #5, RFP- GFPweak may result from the leakiness of the G-TRACE system. We have added our observations of the G-TRACE strains and changed our previous explanation (see Fig. S1B - S1C, Page 5, Line 94 - 97, 103 - 106).

      For cell #2, we agree that RFP+ GFPweak cells may either be a cell turning on pros expression just before sample preparation or a pre-EE undergoing mitosis. Nevertheless, it is not a past EE that has lost the EE marker Pros, so it is considered a current EE. We have removed our previous interpretation of cell #2 (see Page 5, removed “which likely had not yet fully activated recombination”), and changed the image to avoid confusion (see Fig. 1C).

      - Fig. 2C, if past-EE’s increased in number while current EE’s stayed the same, where are new past-EE’s coming from? There cannot be compensatory proliferations since EE’s are post-mitotic. For fate conversion, one would expect the generation of each past-EE to accompany loss of one current EE.

      3.7. Author response:

      We agree that the generation of one past EE should be accompanied by the loss of one current EE. We do not have a clear answer to this question. Our data showed cell numbers per ROI rather than the total cell number across the whole gut. To address this, we have changed the number to the proportion, calculated from [past EE] / ([past EE] + [current EE]), in experiments examining damage-induced EE plasticity, which provides a more informative measure for EE fate conversion (see Fig. 2C, also Fig. S2B and 3E).
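      The proportion described above can be written out as a minimal sketch, assuming per-ROI integer counts of past and current EEs are available. The function name and the example counts are hypothetical and are not taken from the manuscript:

```python
def past_ee_proportion(past_ee, current_ee):
    """Return [past EE] / ([past EE] + [current EE]) for one ROI."""
    total = past_ee + current_ee
    if total == 0:
        # No EE-lineage cells in this ROI, so the proportion is undefined.
        return None
    return past_ee / total


# Hypothetical (past EE, current EE) counts for three ROIs, illustration only.
rois = [(2, 18), (5, 15), (0, 0)]
proportions = [past_ee_proportion(past, current) for past, current in rois]
```

      If one current EE converts into one past EE, the denominator is unchanged and only the numerator grows, so this measure should be less sensitive than raw counts to differences in cell density between ROIs.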

      - Fig. 2E. Dl+ past-EE cell number declined at 14 and 18 h after IR and because cell size increased, the authors conclude that EE cells that de-differentiated into ISCs subsequently re-differentiated into EC’s. To reach this conclusion, the authors should count past-EEs that are positive for EC markers. Cell size alone is insufficient evidence.

      3.8. Author response:

      We have added data quantifying the proportion of past EEs that are positive for the EC marker Pdm1, showing that past EEs were more likely to be ECs in guts examined 14 h after radiation (see Fig. 2F - 2G, Page 9, Line 189).

      - Fig. 6. Where are the % numbers for ISC, EB and EE’s coming from? And wouldn’t these change with time after IR, etc?

      3.9. Author response:

      The numbers came from the calculation of the percentage of the absolute values of control and 14 h post-IR conditions from Fig. 2E. These numbers changed with time after radiation. We realized that the precise numbers were misleading. We therefore have removed such illustration and instead added phrases “more current EEs → past EEs, more past EEs being ISCs → past EEs being ECs” to describe the increase in past EE cell number and the shift in the composition of past EEs (see Fig. 6).

      - Improve Figure 1B: Pros and Dl are shown in the same color, creating confusion. If both are stained together, different colors or clearer labeling should be used. Clarify how cells are identified as Pros+ vs Dl+.

      3.10. Author response:

      Anti-Pros and anti-Dl antibodies were produced from the same host species and were detected with the same secondary antibody, so they were in the same color. We have stated that solid nuclear staining indicates Pros, whereas punctate cytoplasmic staining indicates Dl (see Page 5, Line 100, 102, and 103). Such staining has been reported in previous studies (for example, Fig. 2A - 2B, Veneti et al. Nat. Commun. 2024 DOI: 10.1038/s41467-024-46119-9).

      - Why is Dl (supposed to be cytoplasmic) overlapping with nuclear GFP in cells #3 and 4 in Fig. 1B?

      3.11. Author response:

      Because Dl signals were located apically to DAPI/GFP signals, the overlap was likely due to Z-projection from stacked slices. We present below orthogonal slices along the z-axis, from top to bottom by row, and composite and individual color channels, from left to right by columns, for cell #3 (left) and cell #4 (right).

      For cell #3, Dl signals were present in slices 1/8 and 2/8 and disappeared in slice 3/8, whereas DAPI signals appeared from slice 2/8. For cell #4, Dl signals surrounded DAPI signals when viewed separately. In addition, we realized that nuclear GFP signals slightly outgrew DAPI signals, despite our confirmation that the GFP channel was not saturated.

      We have included separate color channels for DAPI signals and Pros, Dl and DAPI merged channels, showing that Dl signals were absent from the nucleus. For cell #3, in which the nuclear DAPI and cytoplasmic Dl cannot be distinguished in the stacked view, we show the images from a single orthogonal slice in the main panel, and the image from stacked slices as insets (see Fig. 1C).

      - Fig. S1E and F. Very hard to see what the authors describe about Arm and Cora. One problem is that cell boundaries are not visible, just the nuclei, so it is hard to know whether cell-cell interactions the authors describe as normal are really normal. Another problem is the overlap of Arm (supposed to be cytoplasmic) with the nuclear GFP signal. What is that?

      3.12. Author response:

      Regarding the invisibility of cell boundaries, we have improved the image of anti-Cora staining and added anti-Mesh staining and a separate color channel for DAPI signals to reinforce junction integrity (see Fig. S1H - S1I).

      Regarding the overlap of Arm signals with nuclear GFP signals, we noticed problems similar to those noted in 3.11. Author response. We present below orthogonal slices along the z-axis and combined and individual color channels, for cell #2 (left) and cell #3 (right). For both cells, Arm signals did not overlap with DAPI signals. We have adjusted the maximum intensity projection to include slices 1-4 instead of 1-8 and added a separate color channel for DAPI signals to avoid the signals appearing to overlap (see Fig. S1G).
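      Why restricting the projected slices can remove an apparent overlap is illustrated by the following minimal sketch. It assumes a z-stack stored as a list of 2D images; the function name and the toy pixel values are hypothetical and do not represent the actual imaging data or software used:

```python
def max_intensity_projection(stack, first, last):
    """Project slices first..last (1-based, inclusive) of a z-stack.

    stack is a list of 2D images (lists of rows) with identical shapes.
    Each output pixel is the maximum of that pixel over the chosen slices.
    """
    selected = stack[first - 1:last]
    if not selected:
        raise ValueError("no slices selected")
    height, width = len(selected[0]), len(selected[0][0])
    return [[max(img[y][x] for img in selected) for x in range(width)]
            for y in range(height)]


# Toy 8-slice stack of 1x2 images: pixel 0 is bright only in slices 6-8,
# pixel 1 has the same value in every slice.
stack = [[[0, 3]] for _ in range(5)] + [[[9, 3]] for _ in range(3)]
full = max_intensity_projection(stack, 1, 8)     # uses all eight slices
partial = max_intensity_projection(stack, 1, 4)  # uses slices 1-4 only
```

      In this toy stack, the bright value at pixel 0 appears in the full projection but not in the projection of slices 1-4, which is how signals from different depths can appear to overlap in a full projection.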

      - Include a simple schematic of ISC to EE/EC lineages for readers unfamiliar with Drosophila gut biology.

      3.13. Author response:

      We have included a schematic (see Fig. 1A). Although not requested, we have also improved Fig. 1B to enhance clarity.

      - Discuss the regional difference in Xrp1 efficacy (R2a vs R2b). Is there something known about gene expression differences in different gut regions that can explain the results?

      3.14. Author response:

      At present, we do not have an explanation for these results. We have refined our discussion regarding such regional differences (see Page 16 - 17, Line 381 - 390).

      - Consider moving scRNAseq (Fig. S1G) into main paper: this is a central part of the conclusion.

      3.15. Author response:

      We have moved Fig. S1G, as well as Fig. S1H and S1I, into the main figure (see Fig. 1G - 1I).

      4. Description of analyses that authors prefer not to carry out

      - Xrp1 exists as short or long isoforms. The short form has been recently proposed to be required for cell competition (https://doi.org/10.1101/2025.06.15.659587) whereas Xrp1 long isoform may be responsible for reduced cell growth. Could the authors test which isoform is induced in the gut after irradiation? Does overexpression of the Xrp1 long isoform have the same effect as the short isoform used by the authors?

      4.1. Author response:

      We prefer not to distinguish whether the long or short Xrp1 isoform is induced in the gut after radiation. This presents technical challenges and falls outside the scope of the present study. As noted in 3.2. Author response, we instead report in the revised manuscript that both isoforms similarly promote EE plasticity.

      - Xrp1 is known for its role in cell competition and elimination of loser cells by induction of apoptosis. It would be interesting to check for induction of cell death and/or caspase activation in the fly gut after irradiation and verify a non-apoptotic role of DRONC activation in this context using a Dronc RNAi (as proposed by Bergmann lab (https://doi.org/10.1038/s41598-021-81261-0) or Baena-Lopez lab (DOI: 10.15252/embr.201948892)). Overexpression of Xrp1 could be combined with UAS-p35.

      4.2. Author response:

      We prefer not to perform Dronc-RNAi, because we did not observe Dronc activation downstream of Xrp1, as shown in 2.4. Author response.

      - The authors interpret fate-conversion as beneficial for tissue repair but never test whether blocking this process impairs recovery or organismal survival or whether promoting it improves outcomes.

      4.3. Author response:

      We prefer not to examine organismal survival. We agree that organismal survival would be informative, but our study focuses on epithelial cell number, which will be tested as noted in 2.7. Author response. We will not make broad claims at the organismal level.

      - Related to the above, it would be helpful to know if fate-converted cells function as true ISCs or ECs (e.g., through proliferation or absorption assays).

      4.4. Author response:

      We prefer not to perform absorption assays due to technical challenges. We will instead test proliferation, as noted in 2.8. Author response, and note our limitations.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Qian and colleagues report a study on radiation-induced cell fate plasticity in the intestine of Drosophila. Using lineage tracing to mark pre-EE and EE cells, the authors show that these cells can lose the EE/pre-EE marker Pros and express ISC or EC markers, indicating fate conversion. Single cell RNAseq analysis showed that even under basal conditions, the ISC/EB cell population includes those with the EE/pre-EE lineage tracer, confirming fate conversion. The same analysis showed that fate-converted ISC/EB cells express the transcription factor Ets21C, which is associated with regeneration but not normal development. Exposure to ionizing radiation (IR) increases the frequency of fate conversion and is accompanied by the induction of Xrp1 (which is not expressed normally in the EE's). Xrp1 knockdown reduced IR-induced fate conversion, demonstrating necessity. Xrp1 is also sufficient because overexpression of it resulted in increased fate conversion without IR. scRNAseq analysis showed that overexpression of Xrp1 in pre-EE/EE cells (without IR) resulted in the induction of ISC/progenitor state genes such as esg and Sox homologs. Functional testing of the latter group of genes demonstrated their essential role in cell fate plasticity induced by Xrp1.

      Major comments

      • The authors do not justify or explain why they used 100 Gy of radiation. This is higher than doses used in comparable regeneration studies in adult Drosophila (e.g., PMID: 25959206, PMID: 28925355). The authors should clarify why this dose was chosen.
      • The authors' explanation for cells with weak GFP in Figure 1 is not convincing. Induction of GFP is an all-or-nothing event as it results from Pros-driven FLPase and a recombination that removes the transcription stop signals to express GFP from a Ubi promoter. Once that happens, it should not matter how strong or weak Pros is, GFP should be the same. So, another explanation is needed. Nuclear staining of cell #2 in Fig 1B resembles a metaphase chromosome arrangement. Nuclear GFP may appear 'weak' in mitosis as the nuclear envelope breaks down. It is positive for the purple Pros/Dl stain, which makes it hard to tell if it is Pros+ or Pros- even though the authors state that cells with weak GFP are Pros- in line 104 (see the point above regarding confusing same-color stain for ISC and EE markers). Could cell #2 be a pre-EE that is undergoing mitosis since the lineage tracer marks both EE and pre-EE cells (line 119)? Or do the authors mean recombination on one or both homologs? This should not be possible since the cells are heterozygotes for the Ubi-GFP locus.
      • Fig. 2C, if past-EE's increased in number while current EE's stayed the same, where are new past-EE's coming from? There cannot be compensatory proliferations since EE's are post-mitotic. For fate conversion, one would expect the generation of each past-EE to accompany loss of one current EE.
      • Fig. 2C, the number of past EE's increased transiently so that baseline number is restored at 18 hr after IR. The authors conclude that fate plasticity is a transient event. Can they rule out loss due to cell death?
      • Fig. 2E. Dl+ past-EE cell number declined at 14 and 18 h after IR and because cell size increased, the authors conclude that EE cells that de-differentiated into ISCs subsequently re-differentiated into EC's. To reach this conclusion, the authors should count past-EEs that are positive for EC markers. Cell size alone is insufficient evidence.
      • Fig. 6. Where are the % numbers for ISC, EB and EE's coming from? And wouldn't these change with time after IR, etc?
      • The authors interpret fate-conversion as beneficial for tissue repair but never test whether blocking this process impairs recovery or organismal survival or whether promoting it improves outcomes.
      • Related to the above, it would be helpful to know if fate-converted cells function as true ISCs or ECs (e.g., through proliferation or absorption assays).

      Minor comments

      • Improve Figure 1B: Pros and Dl are shown in the same color, creating confusion. If both are stained together, different colors or clearer labeling should be used. Clarify how cells are identified as Pros+ vs Dl+.
      • Why is Dl (supposed to be cytoplasmic) overlapping with nuclear GFP in cells #3 and 4 in Fig. 1B?
      • Fig. S1E and F. Very hard to see what the authors describe about Arm and Cora. One problem is that cell boundaries are not visible, just the nuclei, so it is hard to know whether cell-cell interactions the authors describe as normal are really normal. Another problem is the overlap of Arm (supposed to be cytoplasmic) with the nuclear GFP signal. What is that?
      • Include a simple schematic of ISC to EE/EC lineages for readers unfamiliar with Drosophila gut biology.
      • Discuss the regional difference in Xrp1 efficacy (R2a vs R2b). Is there something known about gene expression differences in different gut regions that can explain the results?
      • Consider moving scRNAseq (Fig. S1G) into main paper: this is a central part of the conclusion.
      • Line 230, Fig S3F-G should be Fig S3G-H.

      Significance

      Xrp1 is known to have a role in DNA Damage Responses and in cell competition and to function in the context of the p53 network, but this is the first time its role in fate conversion has been demonstrated. For the most part, the data are convincing and include strong genetic evidence from loss- and gain-of-function approaches that demonstrate a role for Xrp1 in activating progenitor gene expression and fate conversion. However, there are several experimental and presentation issues that need to be addressed first as outlined in the previous sections.

      The work highlights how mature cells may revert to stem-like states in response to injury, a theme with broad relevance in regenerative medicine.

      My field of expertise lies in DNA damage responses in Drosophila and human cancer models.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      We thank the reviewer for his valuable input and careful assessment, which have significantly improved the clarity and rigor of our manuscript.

      Summary:

      Mazar & Yovel 2025 dissect the inverse problem of how echolocators in groups manage to navigate their surroundings despite intense jamming using computational simulations.

      The authors show that despite the 'noisy' sensory environments that echolocating groups present, agents can still access some amount of echo-related information and use it to navigate their local environment. It is known that echolocating bats have strong small and large-scale spatial memory that plays an important role for individuals. The results from this paper also point to the potential importance of an even lower-level, short-term role of memory in the form of echo 'integration' across multiple calls, despite the unpredictability of echo detection in groups. The paper generates a useful basis to think about the mechanisms in echolocating groups for experimental investigations too.

      Strengths:

      (1) The paper builds on a biologically well-motivated and parametrised 2D acoustics and sensory simulation setup to investigate the various key parameters of interest.

      (2) The 'null-model' of echolocators not being able to tell apart objects & conspecifics while echolocating still shows agents successfully emerge from groups - even though the probability of emergence drops severely in comparison to cognitively more 'capable' agents. This is nonetheless an important result showing the direction-of-arrival of a sound itself is the 'minimum' set of ingredients needed for echolocators navigating their environment.

      (3) The results generate an important basis in unraveling how agents may navigate in sensorially noisy environments with a lot of irrelevant and very few relevant cues.

      (4) The 2D simulation framework is simple and computationally tractable enough to perform multiple runs to investigate many variables - while also remaining true to the aim of the investigation.

      Weaknesses:

      There are a few places in the paper that can be misunderstood or don't provide complete details. Here is a selection:

      (1) Line 61: '... studies have focused on movement algorithms while overlooking the sensory challenges involved' : This statement does not match the recent state of the literature. While the previous models may have had the assumption that all neighbours can be detected, there are models that specifically study the role of limited interaction arising from a potential inability to track all neighbours due to occlusion, and the effect of responding to only one/few neighbours at a time e.g. Bode et al. 2011 R. Soc. Interface, Rosenthal et al. 2015 PNAS, Jhawar et al. 2020 Nature Physics.

      We appreciate the reviewer's comment and the relevant references. We have revised the manuscript accordingly to clarify the distinction between studies that incorporate limited interactions and those that explicitly analyze sensory constraints and interference. We have refined our statement to acknowledge these contributions while maintaining our focus on sensory challenges beyond limited neighbor detection, such as signal degradation, occlusion effects, and multimodal sensory integration (see lines 58-64).

      (2) The word 'interference' is used loosely in places (Line 89: '...took all interference signals...', Line 319: 'spatial interference') - this is confusing as it is not clear whether the authors refer to interference in the physics/acoustics sense, or broadly speaking as a synonym for reflections and/or jamming.

      To improve clarity, we have revised the manuscript to distinguish between different types of interference:

      • Acoustic interference (jamming): Overlapping calls that completely obscure echo detection, preventing bats from perceiving necessary environmental cues.

      • Acoustic interference (masking): Partial reduction in signal clarity due to competing calls.

      • Spatial interference: Physical obstruction by conspecifics affecting movement and navigation.

      We have updated the manuscript to use these terms consistently and explicitly define them in relevant sections (see lines 84-85, 119-120). This distinction ensures that the reader can differentiate between interference as an acoustic phenomenon and its broader implications in navigation.

      (3) The paper discusses original results without reference to how they were obtained or what was done. The lack of detail here must be considered while interpreting the Discussion e.g. Line 302 ('our model suggests...increasing the call-rate..' - no clear mention of how/where call-rate was varied) & Line 323 '..no benefit beyond a certain level..' - also no clear mention of how/where call-level was manipulated in the simulations.

      All tested parameters, including call rate dynamics and call intensity variations, are detailed in the Methods section and Tables 1 and 2. Specifically:

      • Call Rate Variation: The Inter-Pulse Interval (IPI) was modeled based on documented echolocation behavior, decreasing from 100 msec during the search phase to 35 msec (~28 calls per second) at the end of the approach phase, and to 5 msec (200 calls per second) during the final buzz (see Table 2). This natural variation in call rate was not manually manipulated in the model but emerged from the simulated bat behavior.

      • Call Intensity Variation: The tested call intensity levels (100, 110, 120, 130 dB SPL) are presented in Table 1 under the “Call Level” parameter. The effect of increasing call intensity was analyzed in relation to exit probability, jamming probability, and collision rate. This is now explicitly referenced in the Discussion. We have revised the manuscript to explicitly reference these aspects in the Results and Discussion sections – see lines 346-349, 372-375.
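      The call rates quoted above follow directly from the inter-pulse intervals, since rate is the reciprocal of the interval. The following is a minimal illustrative sketch of that conversion; the function name `call_rate_hz` is hypothetical and is not taken from the simulation code.

```python
def call_rate_hz(ipi_ms: float) -> float:
    """Convert an inter-pulse interval (milliseconds) to calls per second."""
    if ipi_ms <= 0:
        raise ValueError("inter-pulse interval must be positive")
    return 1000.0 / ipi_ms


# IPI values quoted in the response: search, end of approach, final buzz.
for phase, ipi_ms in [("search", 100.0), ("end of approach", 35.0), ("buzz", 5.0)]:
    print(f"{phase}: IPI {ipi_ms:g} ms -> {call_rate_hz(ipi_ms):.1f} calls/s")
```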

      Reviewer #2 (Public review):

      We are grateful for the reviewer’s insightful feedback, which has helped us clarify key aspects of our research and strengthen our conclusions.

      This manuscript describes a detailed model of bats flying together through a fixed geometry. The model considers elements that are faithful to both bat biosonar production and reception and the acoustics governing how sound moves in the air and interacts with obstacles. The model also incorporates behavioral patterns observed in bats, like one-dimensional feature following and temporal integration of cognitive maps. From a simulation study of the model and comparison of the results with the literature, the authors gain insight into how often bats may experience destructive interference of their acoustic signals and those of their peers, and how much such interference may actually negatively affect the groups' ability to navigate effectively. The authors use generalized linear models to test the significance of the effects they observe.

      In terms of its strengths, the work relies on a thoughtful and detailed model that faithfully incorporates salient features, such as acoustic elements like the filter for a biological receiver and temporal aggregation as a kind of memory in the system. At the same time, the authors abstract features that are complicating without being expected to give additional insights, as can be seen in the choice of a two-dimensional rather than three-dimensional system. I thought that the level of abstraction in the model was perfect, enough to demonstrate their results without needless details. The results are compelling and interesting, and the authors do a great job discussing them in the context of the biological literature.

      The most notable weakness I found in this work was that some aspects of the model were not entirely clear to me. 

      For example, the directionality of the bat's sonar call in relation to its velocity. Are these the same?

      For simplicity, in our model, the head is aligned with the body, therefore the direction of the echolocation beam is the same as the direction of the flight. 

      Moreover, call directionality (directivity) is not directly influenced by velocity. Instead, directionality is estimated using the piston model, as described in the Methods section. The directionality is based on the emission frequency and is thus primarily linked to the behavioral phases of the bat, with frequency shifts occurring as the bat transitions from search to approach to buzz phases. During the approach phase, the bat emits calls with higher frequencies, resulting in increased directionality. This is supported by the literature (Jakobsen and Surlykke, 2010; Jakobsen, Brinkløv and Surlykke, 2013). This phase is also associated with a natural reduction in flight speed, which is a well-documented behavioral adaptation in echolocating bats (Jakobsen et al., 2024).

      To clarify this in the manuscript, we have updated the text to explicitly state that directionality follows phase-dependent frequency changes rather than being a direct function of velocity, see lines 543-545. 
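      The link between emission frequency and directionality can be illustrated with the textbook circular-piston directivity, D(θ) = |2·J1(ka·sin θ) / (ka·sin θ)|, where k = 2πf/c. The sketch below is illustrative only: the emitter radius (3 mm), the two frequencies, and the speed of sound are assumed values, not parameters from the manuscript, and J1 is evaluated numerically from Bessel's integral so that no external library is needed.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air (assumed)


def bessel_j1(x: float, steps: int = 400) -> float:
    """First-order Bessel function via Bessel's integral (trapezoid rule)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps + 1):
        tau = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def piston_directivity(freq_hz: float, angle_rad: float, radius_m: float) -> float:
    """Relative on-axis-normalised pressure amplitude of a circular piston."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    x = k * radius_m * math.sin(angle_rad)
    if abs(x) < 1e-9:
        return 1.0  # limit of 2*J1(x)/x as x -> 0
    return abs(2.0 * bessel_j1(x) / x)


# Hypothetical emitter radius and frequencies, chosen only for illustration.
radius = 0.003
for f in (40e3, 80e3):
    d = piston_directivity(f, math.radians(30.0), radius)
    print(f"{f / 1e3:.0f} kHz: relative amplitude at 30 deg off-axis = {d:.2f}")
```

      With these assumed values, the higher frequency gives a lower off-axis amplitude, i.e. a narrower beam, which is the qualitative behaviour described in the response.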

      If so, what is the difference between phi_target and phi_tx in the model equations? 

      φ_target represents the angle between the bat and the reflecting object (target).

      φ_Tx represents the angle [rad] between the masking bat and the target, from the transmitter’s perspective.

      φ_TxRx refers to the angle between the transmitting conspecific and the receiving focal bat, from the transmitter’s point of view.

      φ_RxTx represents the angle between the receiving bat and the transmitting bat, from the receiver’s point of view.

      These definitions have been explicitly stated in the revised manuscript to prevent any ambiguity (lines 525-530). Additionally, a Supplementary figure demonstrating the geometrical relations has been added to the manuscript.

      What is a bat's response to colliding with a conspecific (rather than a wall)? 

      In nature, minor collisions between bats are common and typically do not result in significant disruptions to flight (Boerma et al., 2019; Roy et al., 2019; Goldshtein et al., 2025). Given this, our model does not explicitly simulate the physical impact of a collision event. Instead, during the collision event the bat keeps decreasing its velocity and changing its flight direction until the distance between bats is above the threshold (0.4 m). We assume that the primary cost of such interactions arises from the effort required to avoid collisions, rather than from the collision itself. This assumption aligns with observations of bat behavior in dense flight environments, where individuals prioritize collision avoidance; accordingly, we do not model post-collision dynamics. See lines 479-484.

      From the statistical side, it was not clear if replicate simulations were performed. If they were, which I believe is the right way due to stochasticity in the model, how many replicates were used, and are the standard errors referred to throughout the paper between individuals in the same simulation or between independent simulations, or both? 

      The number of repetitions for each scenario is detailed in Table 1, but we included it in a more prominent location in the text for clarity. Specifically, we now state (Lines 110-111):

      "The number of repetitions for each scenario was as follows: 1 bat: 240; 2 bats: 120; 5 bats: 48; 10 bats: 24; 20 bats: 12; 40 bats: 12; 100 bats: 6."

      Regarding the reported standard errors, they are calculated across all individuals within each scenario, without distinguishing between different simulation trials. 

      We clarified this in the revised text (Lines 627-628 in Statistical Analysis).

      Overall, I found these weaknesses to be superficial and easily remedied by the authors. The authors presented well-reasoned arguments that were supported by their results, and which were used to demonstrate how call interference impacts the collective's roost exit as measured by several variables. As the authors highlight, I think this work is valuable to individuals interested in bat biology and behavior, as well as to applications in engineered multi-agent systems like robotic swarms.

      Reviewer #3 (Public review):

      We sincerely appreciate the reviewer’s thoughtful comments and the time invested in evaluating our work, which have greatly contributed to refining our study.

      We would like to note that, in general, our model often simplifies some of the bats’ abilities, under the assumption that if the simulated bats manage to perform this difficult task with simpler mechanisms, real, better-adapted bats will probably perform even better. This reasoning is repeated in several of the responses below.

      Summary:

      The authors describe a model to mimic bat echolocation behavior and flight under high-density conditions and conclude that the problem of acoustic jamming is less severe than previously thought, conflating the success of their simulations (as described in the manuscript) with hard evidence for what real bats are actually doing. The authors base their model on two species of bats that fly at "high densities" (defined by the authors as colony sizes from tens to tens of thousands of individuals and densities of up to 33.3 bats/m²), Pipistrellus kuhlii and Rhinopoma microphyllum. This work fits into the broader discussion of bat sensorimotor strategies during collective flight, and simulations are important to try to understand bat behavior, especially given a lack of empirical data. However, I have major concerns about the assumptions of the parameters used for the simulation, which significantly impact both the results of the simulation and the conclusions that can be made from the data. These details are elaborated upon below, along with key recommendations the authors should consider to guide the refinement of the model.

      Strengths:

      This paper carries out a simulation of bat behavior in dense swarms as a way to explain how jamming does not pose a problem in dense groups. Simulations are important when we lack empirical data. The simulation aims to model two different species with different echolocation signals, which is very important when trying to model echolocation behavior. The analyses are fairly systematic in testing all ranges of parameters used and discussing the differential results.

      Weaknesses:

      The justification for how the different foraging phase call types were chosen for different object detection distances in the simulation is unclear. Do these distances match those recorded from empirical studies, and if so, are they identical for both species used in the simulation? 

      The distances at which bats transition between echolocation phases are identical for both species in our model (see Table 2). These distances are based on well-documented empirical studies of bat hunting and obstacle avoidance behavior (Griffin, Webster and Michael, 1958; Simmons and Kick, 1983; Schnitzler et al., 1987; Kalko, 1995; Hiryu et al., 2008; Vanderelst and Peremans, 2018). These references provide extensive evidence that insectivorous bats systematically adjust their echolocation calls in response to object proximity, following the characteristic phases of search, approach, and buzz.

      To improve clarity, we have updated the text to explicitly state that the phase transition distances are empirically grounded and apply equally to both modeled species (lines 499-508).

      What reasoning do the authors have for a bat using the same call characteristics to detect a cave wall as they would for detecting a small insect? 

      In echolocating bats, call parameters are primarily shaped by the target distance and echo strength. Accordingly, there is little difference in call structure between prey capture and obstacle-related maneuvers, aside from intensity adjustments based on target strength (Hagino et al., 2007; Hiryu et al., 2008; Surlykke, Ghose and Moss, 2009; Kothari et al., 2014). In our study, due to the dense cave environment, the bats are found to operate in the approach phase most of the time, which is consistent with natural cave emergence, where they are navigating through a cluttered environment rather than engaging in open-space search. For one of the species (Rhinopoma), we also have empirical recordings of individuals flying under similar conditions (Goldshtein et al., 2025). Our model was designed to remain as simple as possible while relying on conservative assumptions that may underestimate bat performance. If, in reality, bats fine-tune their echolocation calls even earlier or more precisely during navigation than assumed, our model would still conservatively reflect their actual capabilities. See lines 500-508.

      The two species modeled have different calls. In particular, the bandwidth varies by a factor of 10, meaning the species' sonars will have different spatial resolutions. Range resolution is about 10x better for PK compared to RM, but the authors appear to use the same thresholds for "correct detection" for both, which doesn't seem appropriate.

      The detection process in our model is based on Saillant’s method using a filterbank, as detailed in the paper (Saillant et al., 1993; Neretti et al., 2003; Sanderson et al., 2003). This approach inherently incorporates the advantages of a wider bandwidth, meaning that the differences in range resolution between the species are already accounted for within the signal-processing framework. Thus, there is no need to explicitly adjust the model parameters for bandwidth variations, as these effects emerge from the applied method.

      Also, the authors did not mention incorporating/correcting for/exploiting Doppler, which leads me to assume they did not model it.

      The reviewer is correct. To maintain model simplicity, we did not incorporate the Doppler effect or its impact on echolocation. The exclusion of Doppler effects was based on the assumption that while Doppler shifts can influence frequency perception, their impact on jamming and overall navigation performance is minor within the modelled context.

      The maximal Doppler shifts expected for the bats in this scenario are of ~ 1kHz. These shifts would be applied variably across signals due to the semi-random relative velocities between bats, leading to a mixed effect on frequency changes. This variability would likely result in an overall reduction in jamming rather than exacerbating it, aligning with our previous statement that our model may overestimate the severity of acoustic interference. Such Doppler shifts would result in errors of 2-4 cm in localization (i.e., 200-400 micro-seconds) (Boonman, Parsons and Jones, 2003).
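      The order of magnitude of the quoted Doppler shift can be checked with the classical Doppler formula for a moving source and a moving receiver. The sketch below is illustrative only; the call frequency (40 kHz) and the flight speeds (4 m/s each, head-on) are assumed values chosen for the example, not parameters stated in the response.

```python
SPEED_OF_SOUND = 343.0  # m/s in air (assumed)


def received_frequency(f_emit_hz: float, v_source: float, v_receiver: float) -> float:
    """Classical Doppler formula.

    v_source and v_receiver are the speeds (m/s) of the emitter and the
    listener along the line joining them, positive when approaching.
    """
    return f_emit_hz * (SPEED_OF_SOUND + v_receiver) / (SPEED_OF_SOUND - v_source)


def doppler_shift(f_emit_hz: float, v_source: float, v_receiver: float) -> float:
    """Frequency shift (Hz) heard by the receiver relative to the emitted call."""
    return received_frequency(f_emit_hz, v_source, v_receiver) - f_emit_hz


# Hypothetical head-on encounter: both bats fly at 4 m/s, call at 40 kHz.
shift = doppler_shift(40e3, 4.0, 4.0)
print(f"Doppler shift: {shift:.0f} Hz")
```

      Under these assumptions the shift is just under 1 kHz; bats flying apart would produce a shift of similar size and opposite sign, which is consistent with the mixed effect described above.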

      We have now explicitly highlighted this in the revised version (see lines 548-581).

      The success of the simulation may very well be due to variation in the calls of the bats, which ironically enough demonstrates the importance of a jamming avoidance response in dense flight. This explains why the performance of the simulation falls when bats are not able to distinguish their own echoes from other signals. For example, in Figure C2, there are calls that are labeled as conspecific calls and have markedly shorter durations and wider bandwidths than others. These three phases for call types used by the authors may be responsible for some (or most) of the performance of the model since the correlation between different call types is unlikely to exceed the detection threshold. But it turns out this variation in and of itself is what a jamming avoidance response may consist of. So, in essence, the authors are incorporating a jamming avoidance response into their simulation. 

      We fully agree that the natural variations in call design between the phases contribute significantly to interference reduction (see our discussion in a previous paper in Mazar & Yovel, 2020). However, we emphasize that this cannot be classified as a Jamming Avoidance Response (JAR). In our model, bats respond only to the physical presence of objects and not to the acoustic environment or interference itself. There is no active or adaptive adjustment of call design to minimize jamming beyond the natural phase-dependent variations in call structure. Therefore, while variation in call types does inherently reduce interference, this effect emerges passively from the modeled behavior rather than as an intentional strategy to avoid jamming. 

      The authors claim that integration over multiple pings (though I was not able to determine the specifics of this integration algorithm) reduces the masking problem. Indeed, it should: if you have two chances at detection, you've effectively increased your SNR by 3dB.  

      The reviewer is correct. Indeed, integration over multiple calls improves signal-to-noise ratio (SNR), effectively increasing it by approximately 3 dB per doubling of observations. The specifics of the integration algorithm are detailed in the Methods section, where we describe how sensory information is aggregated across multiple time steps to enhance detection reliability.
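      The "3 dB per doubling" figure corresponds to an SNR gain of 10·log10(N) for N integrated observations. The sketch below illustrates only this idealised relation, assuming that the echo is identical across calls and the noise is independent between calls; it is not the integration algorithm described in the Methods.

```python
import math


def integration_gain_db(n_calls: int) -> float:
    """Idealised SNR gain (dB) from integrating n_calls independent observations."""
    if n_calls < 1:
        raise ValueError("n_calls must be at least 1")
    return 10.0 * math.log10(n_calls)


# Gain for a few window sizes within the 1-10 call range discussed in the review.
for n in (1, 2, 4, 5, 10):
    print(f"{n:2d} calls -> {integration_gain_db(n):.1f} dB")
```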

      They also claim - although it is almost an afterthought - that integration dramatically reduces the degradation caused by false echoes. This also makes sense: from one ping to the next, the bat's own echo delays will correlate extremely well with the bat's flight path. Echo delays due to conspecifics will jump around kind of randomly. However, the main concern is regarding the time interval and number of pings of the integration, especially in the context of the bat's flight speed. The authors say that a 1s integration interval (5-10 pings) dramatically reduces jamming probability and echo confusion. This number of pings isn't very high, and it occurs over a time interval during which the bat has moved 5-10m. This distance is large compared to the 0.4m distance-to-obstacle that triggers an evasive maneuver from the bat, so integration should produce a latency in navigation that significantly hinders the ability to avoid obstacles. Can the authors provide statistics that describe this latency, and discussion about why it doesn't seem to be a problem? 

      As described in the Methods section, the bat’s collision avoidance response does not solely rely on the integration process. Instead, the model incorporates real-time echoes from the last calls, which are used independently of the integration process for immediate obstacle avoidance maneuvers. This ensures that bats can react to nearby obstacles without being hindered by the integration latency. The slower integration, on the other hand, is used for clustering, outlier removal, and estimation of wall directions to support the pathfinding process, as illustrated in Supplementary Figure 1.

      Additionally, our model assumes that bats store the physical positions of echoes in an allocentric coordinate system (x-y). The integration occurs after transforming these detections from a local relative reference frame to a global spatial representation. This allows for stable environmental mapping while maintaining responsiveness to immediate changes in the bat’s surroundings.
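      The transformation from a bat-centred detection to a global x-y position can be sketched as a rotation by the bat's heading followed by a translation to its position. This is a minimal illustration under assumed conventions (angles in radians, heading measured counter-clockwise from the global x-axis, bearing measured relative to the heading); the function and parameter names are hypothetical and not taken from the simulation code.

```python
import math


def egocentric_to_allocentric(bat_x, bat_y, heading_rad, range_m, bearing_rad):
    """Map a detection (range, bearing relative to heading) to global x-y."""
    angle = heading_rad + bearing_rad  # direction of the echo in the global frame
    return (bat_x + range_m * math.cos(angle),
            bat_y + range_m * math.sin(angle))


# The same reflector seen from two successive positions along a straight path
# maps to the same global point, which is what makes integration across calls stable.
first = egocentric_to_allocentric(0.0, 0.0, 0.0, 2.0, 0.0)
second = egocentric_to_allocentric(1.0, 0.0, 0.0, 1.0, 0.0)
print(first, second)
```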

      See lines 600-616 in the revised version.

      The authors are using a 2D simulation, but this very much simplifies the challenge of a 3D navigation task, and there is no explanation as to why this is appropriate. Bat densities and bat behavior are discussed per unit area when realistically it should be per unit volume. In fact, the authors reference studies to justify the densities used in the simulation, but these studies were done in a 3D world. If the authors have justification for why it is realistic to model a 3D world in a 2D simulation, I encourage them to provide references justifying this approach.

      We acknowledge that this is a simplification; however, from an echolocation perspective, a 2D framework represents a worst-case scenario in terms of bat densities and maneuverability:

      • Higher Effective Density: A 2D model forces all bats into a single plane rather than distributing them through a 3D volume, increasing the likelihood of overlap in calls and echoes and making jamming more severe. As described in the text: the average distance to the nearest bat in our simulation is 0.27m (with 100 bats), whereas reported distances in very dense colonies are 0.5m (Fujioka et al., 2021), as observed in Myotis grisescens (Sabol and Hudson, 1995) and Tadarida brasiliensis (Theriault et al., no date; Betke et al., 2008; Gillam et al., 2010)

      • Reduced Maneuverability: In 3D space, bats can use vertical movement to avoid obstacles and conspecifics. A 2D constraint eliminates this degree of freedom, increasing collision risk and limiting escape options.

      Thus, our 2D model provides a conservative, difficult test case, ensuring that our findings are valid under conditions where jamming and collision risks are maximized. Additionally, the 2D framework is computationally efficient, allowing us to perform multiple simulation runs to explore a broad parameter space and systematically test the impact of different variables.

      To address the reviewer’s concern, we have clarified this justification in the revised text and will provide supporting references where applicable (see Methods lines 450-455).

      The focus on "masking" (which appears to be just in-band noise), especially relative to the problem of misassigned echoes, is concerning. If the bat calls are all the same waveform (downsweep linear FM of some duration, I assume - it's not clear from the text), false echoes would be a major problem. Masking, as the authors define it, just reduces SNR. This reduction is something like sqrt(N), where N is the number of conspecifics whose echoes are audible to the bat, so this allows the detection threshold to be set lower, increasing the probability that a bat's echo will exceed a detection threshold. False echoes present a very different problem. They do not reduce SNR per se, but rather they cause spurious threshold excursions (N of them!) that the bat cannot help but interpret as obstacle detection. I would argue that in dense groups the mis-assignment problem is much more important than the SNR problem. 

      There is substantial literature supporting the assumption that bats can recognize their own echoes and distinguish them from conspecific signals (Schnitzler, Bioscience and 2001, no date; Kazial, Burnett and Masters, 2001; Burnett and Masters, 2002; Kazial, Kenny and Burnett, 2008; Chili, Xian and Moss, 2009; Yovel et al., 2009; Beetz and Hechavarría, 2022). However, we acknowledge that false echoes may present a major challenge in dense groups. To address this, we explicitly tested the impact of the self-echo identification assumption in our study; see Results, Figure 1 (the impact of confusion on performance), and lines 399-404 in the Discussion.

      Furthermore, we examined a full confusion scenario, where all reflected echoes from conspecifics were misinterpreted as obstacle reflections (i.e., 100% confusion). Our results show that this significantly degrades navigation performance, supporting the argument that echo misassignment is a critical issue. However, we also explored a simple mitigation strategy based on temporal integration with outlier rejection, which provided some improvement in performance. This suggests that real bats may possess additional mechanisms to enhance self-echo identification and reduce false detections. See lines 411-420 in the manuscript for further discussion. 

      We actually used logarithmically frequency modulated (FM) chirps, generated using the MATLAB built-in function chirp(t, f0, t1, f1, 'logarithmic'). This method aligns with the nonlinear FM characteristics of Pipistrellus kuhlii (PK) and Rhinopoma microphyllum (RM) and provides a realistic approximation of their echolocation signals. We acknowledge that this was not sufficiently emphasized in the original text, and we have now explicitly highlighted this in the revised version to ensure clarity (see Lines 509-512 in Methods).
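      For readers without MATLAB, a logarithmic FM sweep of the kind described can be sketched in plain Python. The sketch assumes the usual logarithmic-sweep law, in which the instantaneous frequency is f(t) = f0·(f1/f0)^(t/t1), and integrates it to obtain the phase. The sweep limits, duration, and sample rate below are hypothetical values chosen for illustration, not the call parameters used in the manuscript.

```python
import math


def instantaneous_frequency(f0_hz, f1_hz, duration_s, t):
    """Frequency of a logarithmic sweep at time t (f0 at t=0, f1 at t=duration)."""
    return f0_hz * (f1_hz / f0_hz) ** (t / duration_s)


def log_chirp(f0_hz, f1_hz, duration_s, sample_rate_hz):
    """Samples of cos(phase), where phase is the integral of 2*pi*f(t)."""
    if f0_hz == f1_hz:
        raise ValueError("f0 and f1 must differ for a sweep")
    ratio = f1_hz / f0_hz
    scale = 2.0 * math.pi * f0_hz * duration_s / math.log(ratio)
    n_samples = int(round(duration_s * sample_rate_hz))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        phase = scale * (ratio ** (t / duration_s) - 1.0)
        samples.append(math.cos(phase))
    return samples


# Hypothetical downward sweep: 80 kHz to 40 kHz over 2 ms, sampled at 500 kHz.
call = log_chirp(80e3, 40e3, 0.002, 500e3)
print(len(call), instantaneous_frequency(80e3, 40e3, 0.002, 0.001))
```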

      The criteria set for flight behavior (lines 393-406) are not justified with any empirical evidence of the flight behavior of wild bats in collective flight. How did the authors determine the avoidance distances? Also, what is the justification for the time limit of 15 seconds to emerge from the opening? Instead of an exit probability, why not instead use a time criterion, similar to "How long does it take X% of bats to exit?"

      While we acknowledge that wild bats may employ more complex behaviors for collision avoidance, we chose to implement a simplified decision-making rule in our model to maintain computational tractability.

      The avoidance distances (1.5 m from walls and 0.4 m from other bats) were selected as internal parameters to support stable and realistic flight trajectories while maintaining a reasonable collision rate. These values reflect a trade-off between maneuverability and behavioral coherence under crowding. To address this point, we added a sensitivity analysis to the revised manuscript. Specifically, we tested the effect of varying the conspecific avoidance distance from 0.2 to 1.6 meters at bat densities of 2 to 40 bats/3m². The only statistically significant impact was at the highest density (40 bats/3m²), where exit probability increased slightly from 82% to 88% (p = 0.024, t = 2.25, DF = 958). No significant changes were observed in exit time, collision rate, or jamming probability across other densities or conditions (GLM, see revised Methods). These results suggest that the selected avoidance distances are robust and not a major driver of model performance, see lines 469-47.

      The 15-second exit limit was determined as described in the text (Lines 489-491): “A 15-second window was chosen because it is approximately twice the average exit time for 40 bats and allows for a second corrective maneuver if needed.” In other words, it allowed each bat to circle the ‘cave’ twice to exit even in the most crowded environment. This threshold was set to keep simulation time reasonable while allowing sufficient time for most bats to exit successfully.

      We acknowledge that the alternative approach suggested by the reviewer, measuring the time taken for a certain percentage of bats to exit, is also valid. However, in our model, some outlier bats fail to exit and continue flying for many minutes. Such cases would lead to excessive simulation times, making it difficult to generate repetitions, without teaching us much: they usually resulted from the bat slightly missing the opening (see Video S1). Our chosen approach ensures practical runtime constraints while still capturing relevant performance metrics.

      What is the empirical justification for the 1-10 calls used for integration?  

      The "average exit time for 40 bats" is also confusing and not well explained. Was this determined empirically? From the simulation? If the latter, what are the conditions?

      Does it include masking, no masking, or which species? 

      Previous studies have demonstrated that bats integrate acoustic information received sequentially over several echolocation calls (2-15), effectively constructing an auditory scene in complex environments (Ulanovsky and Moss, 2008; Chili, Xian and Moss, 2009; Moss and Surlykke, 2010; Yovel and Ulanovsky, 2017; Salles, Diebold and Moss, 2020). Additionally, bats are known to produce echolocation sound groups when spatiotemporal localization demands are high (Kothari et al., 2014). Studies have documented call sequences ranging from 2 to 15 grouped calls (Moss and Surlykke, 2010), and it has been hypothesized that grouping facilitates echo segregation.

      We did not use a single integration window. We tested integration sizes between 1 and 10 calls and presented the results in Figure 3A. This range was chosen based on prior empirical findings and to explore how different levels of temporal aggregation impact navigation performance. Indeed, the results showed that performance levels off with an integration window of 5-10 calls (Figure 3A).

      Regarding the average exit time for 40 bats, this value was determined from our simulations, where it represents the mean time for successful exits under standard conditions with masking. We have revised the text to clarify these details; see lines 489-491.

      Reviewer #1 (Recommendations for the authors):

      (1) Data Availability:

      As it stands now, this reviewer cannot vouch for the uploaded code as it wasn't accessible according to F.A.I.R principles. The link to the code/data points to a private company's file-hosting account that requires logging in or account creation to see its contents, and thus cannot be accessed.

      This reviewer urges the authors to consider uploading the code onto an academic data repository from the many on offer (e.g. Dryad, Zenodo, OSF). Some repositories offer an option to share a private link (e.g. Zenodo) to the folder that can then be shared only with reviewers so it is not completely public.

      This is a computational paper, and the credibility of the results is based on the code used to generate them.

      The code is available at GitHub as required:

      https://github.com/omermazar/Colony-Exit-Bat-Simulation

      (2) Abstract:

      Line 22: 'To explore whether..' - replace 'whether' with 'how'?

      The sentence was rephrased as suggested by the reviewer.

      (3) Main text:

      Line 43: '...which may share...' - correct to '...which share...', as elegantly framed in the authors' previous work - jamming avoidance is unavoidable because all FM bats of a species still share >90% of spectral bandwidth despite a few kHz shift here and there.

      The sentence was rephrased as suggested by the reviewer.

      Line 49: The authors may wish to additionally cite the work of Fawcett et al. 2015 (J. Comp. Phys A & Biology Open)

      Thank you for the suggestion. We have included a citation to the work of Fawcett et al. (2015) in the revised manuscript.

      Line 61: This statement does not match the recent state of the literature. While the previous models may have assumed that all neighbours can be detected, there are models that specifically study the role of limited interaction arising from the potential inability to track all neighbours, and the effect of responding to only one/few neighbours at a time e.g. Bode et al. 2011 R. Soc. Interface, Jhawar et al. 2020 Nature Physics.

      We have added citations to the important studies suggested by the reviewer, as detailed in the Public Review above.

      Line 89: '..took all interference signals into account...' - what is meant by 'interference signals'? Are the authors referring to reflections? This is unclear.

      We have revised the sentence and detailed the acoustic signals involved in the process: self-generated echoes, calls from conspecifics, and echoes from cave walls and other bats evoked by those calls, see lines 99-106.

      Figure 1A: The colour scheme with overlapping points makes the figure very hard to understand what is happening. The legend has colours from subfigures B-D, adding to the confusion.

      What does the yellow colour represent? This is not clear. Also, in general, the color schemes in the simulation trajectories and the legend are not the same, creating some amount of confusion for the reader. It would be good to make the colour schemes consistent and visually separable (e.g. consp. call direct is very similar to consp. echo from consp. call), and perhaps also if possible add a higher resolution simulation visualisation. Maybe it is best to separate out the colour legends for each sub-figure.

      The updated figure now includes clearer, more visually separable colors, and consistent color coding across all sub-panels. The yellow trajectory representing the focal bat’s flight path is now explicitly labeled, and we adjusted the color mapping of acoustic signals (e.g., conspecific calls vs. echoes) to improve distinction. We also revised the figure caption accordingly and ensured that the legend is aligned with the updated visuals. These modifications aim to enhance interpretability and reduce ambiguity for the reader.

      Figure 1 C3: What is 'FB Channel'? This is not explained in the legend.

      ‘FB Channel’ stands for ‘Filter Bank Channel’. This clarification has been added to the caption of Figure 1.

      Figure 3: Visually noticing that the colour legend is placed only on sub-figure A is tricky and readers may be left searching for the colour legend. Maybe lay out the legend horizontally on top of the entire figure, so it stands out?

      We have adjusted the placement of the color legend in Figure 3 to improve visibility and consistency.

      Line 141: '..the probability of exiting..' - how is this probability calculated - not clear.

      We have clarified in the revised text that the probability of exiting the cave within 15 seconds is defined as the number of bats that exited the cave within that time divided by the total number of bats in each scenario, see lines 159-160.
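
      As a minimal sketch of the definition above (not the authors' simulation code), the exit probability can be computed as follows. The function name, the use of `None` for a bat that never exits, and the sample exit times are assumptions made for illustration only.

```python
def exit_probability(exit_times_s, time_limit_s=15.0):
    """Fraction of bats that exited within the time limit.

    exit_times_s: one entry per bat; None marks a bat that never exited
    (a hypothetical encoding chosen for this sketch).
    """
    if not exit_times_s:
        raise ValueError("at least one bat is required")
    exited = sum(1 for t in exit_times_s if t is not None and t <= time_limit_s)
    return exited / len(exit_times_s)


# Hypothetical exit times (seconds) for five bats in one scenario.
example_times = [6.2, 9.8, None, 14.9, 21.0]
print(exit_probability(example_times))  # 0.6
```

      Three of the five hypothetical bats exit within 15 seconds, so the function returns 3/5.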

      Line 142: What are the sample sizes here - i.e. how many simulation replicates were performed?

      We have clarified the number of repetitions in each scenario in the revised text, as detailed in the Public Review above.

      Line 151: 'The jamming probability,...number of jammed echoes divided by the total number of reflected echoes' - it seems like these are referring to 'own' echoes or first-order reflections, it is important to clarify this.

      The reviewer is right. We have clarified it in the revised text, see lines 173-175.

      Line 153: '..with a maximum difference of ...' - how is this difference calculated? What two quantities are being compared - not clear.

      We have revised the text to clarify that the 14.3% value reflects the maximum difference in jamming probability between the RM and PK models, which occurred at a density of 10 bats. The values at each density are shown in Figure 2D, see lines 175-177.

      Line 221: '..temporal aggregation helps..' - I'm assuming the authors meant temporal integration? However, I would caution against using the exact term 'temporal integration' as it is used in the field of audition to mean something different. Perhaps something like 'sensory integration' , or 'multi-call integration'

      To avoid ambiguity and better reflect the process modeled in our work, we have replaced the term "temporal aggregation" with "multi-call integration" throughout the revised manuscript. This term more accurately conveys the idea of combining information from multiple echolocation calls without conflicting with existing terminology.

      (4) Discussion

      Lines 302: 'Our model suggests...increasing the call-rate..' - not clear where this is explicitly tested or referred to in this manuscript. Can't see what was done to measure/quantify the effect of this variable in the Methods or anywhere else.

      We have rephrased this paragraph as detailed in the Public Review above, see lines 346-349.

      Line 319: 'spatial interference' - unclear what this means. This reviewer would strongly caution against creating new terms unless there is an absolute need for it. What is meant by 'interference' in this paper is hard to assess given that the word seems to be used as a synonym for jamming and also for actual physical wave-based interference.

      We have rephrased this paragraph as detailed in the Public Review above, see lines 119-120 and 366-367.

      Line 323: '..no benefit beyond a certain level...' - also not clear where this is explicitly tested. It seems like there was a set of simulations run for a variety of parameters but this is not written anywhere explicitly. What type of parameter search was done, was it all possible parameter combinations - or only a subset? This is not clear.

      We have rephrased this paragraph as detailed in the Public Review above, see lines 372-375.

      Line 324: '..ca. 110 dB-SPL.' - what reference distance?

      All call levels were simulated and reported in dB-SPL, referenced at 0.1 meters from the emitting bat. We have clarified it in the revised text in the relevant contexts and specifically in line 529.
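
      Because the two reference-distance conventions (0.1 m and 1 m) are easy to confuse, the following sketch converts a source level between them. It assumes simple spherical spreading and ignores atmospheric absorption; the function name is hypothetical and the calculation is not taken from the manuscript.

```python
import math


def convert_reference_distance(level_db, from_ref_m, to_ref_m):
    """Re-reference a source level, assuming spherical spreading only
    (level drops by 20*log10 of the distance ratio; no air absorption)."""
    if from_ref_m <= 0 or to_ref_m <= 0:
        raise ValueError("reference distances must be positive")
    return level_db - 20.0 * math.log10(to_ref_m / from_ref_m)


# A 110 dB-SPL call referenced at 0.1 m corresponds to about 90 dB-SPL at 1 m.
level_at_1m = convert_reference_distance(110.0, from_ref_m=0.1, to_ref_m=1.0)
print(round(level_at_1m, 1))
```

      Under this assumption, the tenfold change in reference distance accounts for a 20 dB offset between the two conventions.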

      (5) Methods

      Line 389 : '...over a 2 x 1.5 m2 area..' It took a while to understand this statement and put it in context. Since there is no previous description of the entire L-arena, the reviewer took it to mean the simulations happened over the space of a 2 x 1.5 m2 area. Include a top-down description of the simulation's spatial setup and rephrase this sentence.

      To address the confusion, we revised the text to clarify that the full simulation environment represents a corridor-shaped cave measuring 14.5 × 2.5 meters, with a right-angle turn located 5.5 meters before the exit, as shown in Figure 1A. The 2 × 1.5 m area refers specifically to the small zone at the far end of the cave where bats begin their flight. The revised description now includes a clearer spatial overview to prevent ambiguity, see lines 456-460.

      Line 398: Replace 'High proximity' with 'Close proximity'

      Replaced.

      Line 427: 'uniform target strength of -23 dB' - at what distance is this target strength defined? Given the reference distance can vary by echolocation convention (0.1 or 1 m), one can't assess if this is a reasonable value or not.

      The reference distance for the reported target strength is 1 meter, in line with standard acoustic conventions. We have revised the text to clarify this explicitly (line 531).

      Also, independent of the reference distance, particularly with reference to bats, the target strength is geometry-dependent, based on whether the wings are open or not. Using the entire wingspan of a bat to parametrise the target strength is an overestimate of the available reflective area. The effective reflective area is likely to be somewhere closer to the surface area of the body and a fraction of the wingspan together. This is important to note and/or mention explicitly since the value is not experimentally parametrised.

      For comparison, experimentally based measurements used in Goetze et al. 2016 are -40 dB (presumably at 1 m since the source level is also defined at 1 m?), and Beleyur & Goerlitz 2019 show a range between -43 to -34 dB at 1 m.

      We agree with the reviewer that target strength in bats is strongly influenced by their geometry, particularly wing posture during flight. In our model, we simplified this aspect by using a constant target strength, as the detailed temporal variation in body and wing geometry is pseudo-random and not explicitly modeled. We acknowledge that this is a simplification, and have now stated this limitation clearly in the revised manuscript. We chose a fixed value of –23 dB at 1 meter to reflect a plausible mid-range estimate, informed by anatomical data and consistent with values reported for similarly sized species (Beleyur and Goerlitz, 2019). To support this, we directly measured the target strength of a 3D-printed RM bat model, obtaining –32 dB.

      Moreover, a sensitivity analysis across a wide range (–49 to –23 dB) confirmed that performance metrics remain largely stable, indicating that our conclusions are not sensitive to this parameter, and suggesting that our results hold for different-sized bats. See lines 384-390, 533-538, and Supplementary Figures 3 and 4 in the revised article. 

      Line 434: 'To model the bat's cochlea...'. Bats have two cochleas. This model only describes one, while the agents are also endowed with the ability to detect sound direction - which requires two ears/cochleas.... There is missing information about the steps in between that needs to be provided.

      We appreciate the reviewer’s observation. Indeed, our model is monaural, and simulates detection using a single cochlear-like filter bank receiver. We have clarified this in the revised text to avoid confusion. This paragraph specifically describes the detection stage of the auditory processing pipeline. The localization process, which builds on detection and includes directional estimation, is described in the following paragraph (see line 583 onward), as discussed in the next comment and response.

      Line 457: 'After detection, the bat estimates the range and Direction of Arrival...' This paragraph describes the overall idea, but not the implementation. What were the inputs and outputs for the range and DOA calculation performed by the agent? Or was this information 'fed' in by the simulation framework? If there was no explicit DOA step that the agent performed, but it was assumed that agents can detect DOA, then this needs to be stated.

      In the current simulation, the Direction of Arrival (DOA) was not modeled via an explicit binaural processing mechanism. Instead, based on experimental studies (Simmons et al., 1983; Popper and Fay, 1995), we assumed that bats can estimate the direction of an echo with an angular error that depends on the signal-to-noise ratio (SNR). Accordingly, the inputs to the DOA estimation were the peak level of the desired echo, noise level, and the level of acoustic interference. The output was an estimated direction of arrival that included a random angular error, drawn from a normal distribution whose standard deviation varied with the SNR. We have revised the relevant paragraph (Lines 583-592) to clarify this implementation.
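
      The response does not give the exact relation between SNR and angular error, so the following is only an illustrative sketch of the described input/output structure. The power-sum of noise and interference, the SNR-to-standard-deviation mapping, and all parameter values (`sd_at_0db_deg`, `min_sd_deg`) are hypothetical assumptions, not the authors' implementation.

```python
import math
import random


def combined_level_db(noise_db, interference_db):
    """Power-sum of noise and interference levels (an assumption of this sketch)."""
    return 10.0 * math.log10(10 ** (noise_db / 10.0) + 10 ** (interference_db / 10.0))


def angular_error_sd_deg(snr_db, sd_at_0db_deg=10.0, min_sd_deg=1.0):
    """HYPOTHETICAL mapping: the error SD shrinks as SNR grows, with a floor."""
    return max(min_sd_deg, sd_at_0db_deg * 10 ** (-snr_db / 20.0))


def estimate_doa_deg(true_doa_deg, echo_db, noise_db, interference_db, rng):
    """Return the true direction plus a zero-mean normal error whose SD depends on SNR."""
    snr_db = echo_db - combined_level_db(noise_db, interference_db)
    return true_doa_deg + rng.gauss(0.0, angular_error_sd_deg(snr_db))


rng = random.Random(0)  # seeded only to make the example repeatable
print(estimate_doa_deg(30.0, echo_db=40.0, noise_db=0.0, interference_db=20.0, rng=rng))
```

      The only property carried over from the response is that the angular error is normally distributed with a standard deviation that depends on the SNR; the specific curve is a placeholder.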

      Line 464: 'To evaluate the impact of the assumption...' - the 'self' and 'non-self' echoes can be distinguished perhaps using pragmatic time-delay cues, but also using spectro-temporal differences in individual calls/echoes. Do the agents have individual call structures, or do all the agents have the same call 'shape'? The echolocation parameters for the two modelled species are given, but whether there is call parameter variation implemented in the agents is not mentioned.

      In our relatively simple model, all individuals emit the same type of chirp call, with parameters adapted only based on the distance to the nearest detected object. However, individual variation is introduced by assigning each bat a terminal frequency drawn from a normal distribution with a standard deviation of 1 kHz, as described in the revised version (lines 519-520). This small variation is not used explicitly as a spectro-temporal cue for echo discrimination.

      In our model, all spectro-temporal variations—whether due to call structure or variations resulting from overlapping echoes from nearby reflectors—are processed through the filter bank, which compares the received echoes to the transmitted call during the detection stage. As such, the detection process itself can act as a discriminative filter, to some extent, based on similarity to the emitted call.

      We acknowledge that real bats likely rely on a variety of spectro-temporal features for distinguishing self from non-self-echoes—such as call duration, received level, multi-harmonic structure, or amplitude modulation. In our simulation, we focus on comparing two limiting conditions: full recognition of self-generated echoes versus full confusion. Implementing a more nuanced self-recognition mechanism based on temporal or spectral cues would be a valuable extension for future work.

      (6) References

      Reference 22: Formatting error - and extra '4' in the reference.

      The error has been fixed.

      (7) Thoughts/comments

      Even without 'recognition' of walls & conspecifics, bats may be able to avoid obstacles - this is a neat result. Also, using their framework the authors show that successful 'blind' object-agnostic obstacle avoidance can occur only when supported by some sort of memory. In some sense, this is a nice intermediate step showing the role of memory in bat navigation. We know that bats have good long-term and long-spatial scale memory, and here the authors show that short-term spatial memory is important in situations where immediate sensory information is unreliable or unavailable.

      We appreciate the reviewer’s thoughtful summary. Indeed, one of the main takeaways of our study is that successful obstacle avoidance can occur even without explicit recognition of walls or conspecifics—provided that a clustered multi-call integration is in place. Our model shows that when immediate sensory information is unreliable, integrating detections over time becomes essential for effective navigation. This supports the broader view that memory, even on short timescales, plays an important role in bat behavior.

      (8) Reporting GLM results

      The p-value, t-statistic, and degrees of freedom are reported consistently across multiple GLM results. However, the most important part which is the effect size is not consistently reported - and this needs to be included in all results, and even in the table. The effect size provides an indicator of the parameter's magnitude, and thus scientific context.

      We agree that the effect size provides essential scientific context. In fact, we already include the effect size explicitly in Table 1, as shown in the “Effect Size” column for each tested parameter. These values describe the magnitude of each parameter’s effect on exit probability, jamming probability, and collision rate. In the main text, effect sizes are presented as concrete changes in performance metrics (e.g., “exit probability increased from 20% to 87%,” or “with a decrease of 3.5%±8% to 5.5%±5% (mean ± s.e.)”), which we believe improves interpretability and scientific relevance.  

      To further clarify this in the main text, we have reviewed the reported results and ensured that effect sizes are mentioned more consistently wherever GLM outcomes are discussed. Additionally, we have added a brief note in the table caption to emphasize that effect sizes are provided for all tested parameters.

      The 'tStat' appears multiple times and seems to be the output of the MATLAB GLM function. This acronym is specific to the MATLAB implementation and needs to be replaced with a conventionally used acronym such as 't', or the full form 't-statistic' too. This step is to keep the results independent of the programming language used.

      We have replaced all instances of tStat with the more conventional term ‘t’ throughout the manuscript to maintain consistency with standard reporting practices.

      Reviewer #2 (Recommendations for the authors):

      In addition to my public review, I had a few minor points that the authors may want to consider when revising their paper.

      (1) Figures 2, 3, and 4 may benefit from using different marker styles, in addition to different colors, to show the different cases.

      Thank you for the suggestion. In Figures 2–4, the markers represent means with standard error bars. To maintain clarity and consistency across all conditions, we have chosen to keep a standardized marker style – and we clarify this in the legend. We found that varying only the colors is sufficient for distinguishing between conditions without introducing visual clutter.

      (2) The text "PK" in the inset for Figure 2A is very difficult to read. I would suggest using grey as with "RM" in the other inset.

      We have updated the insert in Figure 2A to improve legibility.

      (3) Are the error bars in Figure 3 very small? I wasn't able to see them. If that is the case, the authors may want to mention this in the caption.

      You are correct—the error bars are present in all plots but appear very small due to the large number of simulation repetitions and low variability. We have revised the caption to explicitly mention this.

      (4) The species name of PK is spelled inconsistently (kuhli, khulli, and kuhlii).

      We have corrected the species name throughout the manuscript.

      (5) Table 1 is a great condensation of all the results, but the time to exit is missing. It may be helpful if summary statistics on that were here as well.

      We have added time-to-exit to the effect size column in Table 1, alongside the other performance metrics, to provide a more complete summary of the simulation results.

      (6) I may have missed it, but why are there two values for the exit probability when nominal flight speed is varied?

      The exit probability was not monotonic with flight speed, but rather showed a parabolic trend with a clear optimum. Therefore, we reported two values representing the effect before and after the peak. We have clarified this in the revised table and updated the caption accordingly.

      (7) Table 2 has an extra header after the page break on page 18.

      The extra header in Table 2 after the page break has been removed in the revised manuscript.

      (8) The G functions have 2 arguments in their definitions and Equation 1, but only one argument in Equations 2 and 3. I wasn't able to see why.

      Thank you for pointing this out. You are correct—this was a typographical error. We have corrected the argument notation in Equations 2 and 3 and explicitly included the frequency dependence of the gain (G) functions in both equations.

      (9) D_txrx was not defined but it was used in Equation 2.

      The variable D_txrx is defined in the equation notation section as: D_txrx – the distance [m] between the transmitting conspecific and the receiving focal bat, from the transmitter’s perspective. We have now ensured that this definition is clearly linked to Equation 2 in the revised text. Moreover, we have added a supplementary figure that illustrates the geometric configuration defined by the equations to further support clarity, as described in the Public Review above.

      (10) It was hard for me to understand what was meant by phi_rx and phi_tx. These were described as angles between the rx or tx bats and the target, but I couldn't tell what the point defining the angle was. Perhaps a diagram would help, or more precise definitions.

      We have revised the caption to provide clearer and more precise definitions. Additionally, we have included a geometric diagram as a supplementary figure, as noted in the Public Review above, to visually clarify the spatial relationships and angle definitions used in the equations, see lines 498-499.

      (11) Was the hearing threshold the same for both species?

      Yes. We have clarified it in the revised version.

      (12) Collision avoidance is described as turning to the "opposite direction" in the supplemental figure explaining the model. Is this 90 degrees or 180 degrees? If 90 degrees, how do these turns decide between right and left?

      In our model, the bat does not perform a fixed 90° or 180° turn. Instead, the avoidance behavior is implemented by setting the maximum angular velocity in the direction opposite to the detected echo. For example, if the obstacle or conspecific is detected on the bat’s right side, the bat begins turning left, and vice versa.

      This turning direction is re-evaluated at each decision step, which occurs after every echolocation pulse. The bat continues turning in the same direction if the obstacle remains in front, otherwise it resumes regular pathfinding. We have clarified this behavior in the updated figure caption and model description, see lines 478-493.
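
      The turning rule described above can be sketched as follows. The sign convention (positive bearing means the echo is on the bat's right; positive rate means a right turn), the handling of an echo exactly straight ahead, and the function name are assumptions of this sketch rather than details from the manuscript.

```python
def avoidance_turn_rate(echo_bearing_deg, max_turn_rate_deg_s):
    """Turn at maximum angular velocity away from the side of the detected echo.

    echo_bearing_deg > 0: echo on the right, so turn left (negative rate).
    echo_bearing_deg < 0: echo on the left, so turn right (positive rate).
    A bearing of exactly 0 is resolved as a left turn (arbitrary choice here).
    """
    if max_turn_rate_deg_s < 0:
        raise ValueError("max_turn_rate_deg_s must be non-negative")
    if echo_bearing_deg >= 0:
        return -max_turn_rate_deg_s
    return max_turn_rate_deg_s


# Re-evaluated after every echolocation pulse; the bearings are hypothetical.
for bearing in (25.0, 10.0, -5.0):
    print(bearing, avoidance_turn_rate(bearing, max_turn_rate_deg_s=100.0))
```

      Because the rule is re-applied at each decision step, the resulting turn angle is not fixed at 90° or 180°; it depends on how long the obstacle remains in front of the bat.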

      Reviewer #3 (Recommendations for the authors):

      (1) Lines 27-31: These sentences mischaracterize the results. This claim appears to equate "the model works" with "this is what bats actually do." Also, the model does not indicate that bats' echolocation strategies are robust enough to mitigate the effects of jamming - this is self-evident from the fact that bats navigate successfully via echolocation in dense groups.

      Thank you for the comment. Our aim was not to claim that the model confirms actual bat behavior, but rather to demonstrate that simple and biologically plausible strategies—such as signal redundancy and basic pathfinding—are sufficient to explain how bats might cope with acoustic interference in dense settings. We have revised the wording to better reflect this goal and to avoid overinterpreting the model's implications.

      See abstract in the revised version.  

      (2) Line 37: This number underestimates the number of bats that form some of the largest aggregations of individuals worldwide - the free-tailed bats can form aggregations exceeding several million bats.

      We have revised the text to reflect that some bat species, such as free-tailed bats, are known to form colonies of several million individuals, which exceed the typical range. The updated sentence accounts for these extreme cases, see lines 36-37.

      (3) The flight densities explained in the introduction and chosen references are not representative of the literature - without providing additional justification for the chosen species, it can be interpreted that the selection of the species for the simulation is somewhat arbitrary. If the goal is to model dense emergence flight, why not use a species that has been studied in terms of acoustic and flight behavior during dense emergence flights---such as Tadarida brasiliensis?

      Our goal was to develop a general model applicable to a broad class of FM-echolocating bat species. The two species we selected, Pipistrellus kuhlii (PK) and Rhinopoma microphyllum (RM), span a wide range of signal characteristics, from wideband (PK) to narrowband (RM), providing a representative contrast in call structure.

      Although we did not include Tadarida brasiliensis (TB) specifically, its echolocation calls are acoustically similar to RM in terminal frequency and fall between PK and RM in bandwidth. Therefore, we believe our findings are likely to generalize to TB and other FM-bats.

      Moreover, as noted in a previous response, the average inter-bat distance in our highest-density simulations (0.27 m) is still smaller than those reported for Tadarida brasiliensis during dense emergences—further supporting the relevance of our model to such scenarios.

      To support broader applicability, we also provide a supplementary graphical user interface (GUI) that allows users to modify key echolocation parameters and explore their impact on behavior—making the framework adaptable to additional species, including TB.

      (4) Line 78: It is not clear how (or even if) the simulated bats estimate the direction of obstacles. The explanation given in lines 457-463 is quite confusing. What is the acoustic/neurological mechanism that enables this direction estimation? If there is some mechanism (such as binaural processing), how does this extrapolate to 3D?

      This comment echoes a similar concern raised by a previous reviewer. As explained earlier, in the current simulation, the Direction of Arrival (DOA) was not modeled via an explicit binaural processing mechanism. The complete explanation is detailed in our response to Reviewer #1, Line 457. This implementation is now clarified in the revised text, and a detailed description of the localization process is also provided in the Methods section (lines 583-592).

      (5) The authors propose they are modeling the dynamic echolocation of bats in the simulation (line 79), but it appears (whether this is due to a lack of information in the manuscript or true lack in the simulation) that the authors only modeled a flight response. How did the authors account for bats dynamically changing their echolocation? This is unclear and from what I can tell may just mean that the bats can switch between foraging phase call types depending on the distance to a detected obstacle. Can the authors elaborate more on this?

      The echolocation behavior of the bats, including dynamic call adjustments, was implemented in the simulation and is described in detail in the Methods section (lines 498-520 and Table 2). To avoid redundancy, the Results chapter originally referred to this section, but we have now added a brief explanation in the Results to clarify that the bats’ call parameters (IPI, duration, and frequency range) adapt based on the distance to detected objects, following empirically documented echolocation phases ("search," "approach," "buzz"). These dynamics are consistent with established bat behavior during navigation in cluttered environments such as caves.

      (6) Figure 1 C3: "Detection threshold": what is this and how was it derived?

      The caption also mentions yellow arrows, but they are absent from the figure. C4: Each threshold excursion is marked with an asterisk, but there are many more excursions than asterisks. Why are only some marked? Unclear.

      C3: The detection threshold is determined dynamically. It is set to the greater of either 7 dB above the noise level (0 dB-SPL) (Kick, 1982; Saillant et al., 1993; Sanderson et al., 2003; Boonman et al., 2013) or the maximal received level minus 70 dB, effectively applying a dynamic range of 70 dB. This clarification has been added to the Methods section. The yellow arrow has been added.
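
      The threshold rule stated above reduces to a single `max` operation. The sketch below uses the values given in the response (7 dB margin, 0 dB-SPL noise level, 70 dB dynamic range); the function and parameter names are hypothetical.

```python
def detection_threshold_db(max_received_db, noise_db=0.0,
                           margin_db=7.0, dynamic_range_db=70.0):
    """Greater of (noise level + margin) and (strongest received level - dynamic range)."""
    return max(noise_db + margin_db, max_received_db - dynamic_range_db)


# Weak strongest signal: the noise-based floor applies.
print(detection_threshold_db(60.0))  # 7.0
# Strong signal: the 70 dB dynamic range raises the threshold.
print(detection_threshold_db(90.0))  # 20.0
```

      With a 0 dB-SPL noise level, the dynamic-range term takes over once the strongest received level exceeds 77 dB-SPL.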

      C4: Thank you for this important observation. Only peaks marked with asterisks represent successful detections, that is, those identified in both the interference-free and full detection conditions, as explained in the Methods. Other visible peaks result from masking signals or overlapping echoes from nearby reflectors, but they do not meet the detection criteria. To keep the figure caption concise, we have elaborated on this process more clearly in the revised Methods section. We added this information to the legend.

      (7) Figure 2: A line indicating RM, No Masking is absent

      Thank you for pointing this out. The missing line for RM, No Masking has now been added in the revised version of Figure 2.

      (8) Line 121: "reflected off conspecifics". Does this mean echoes due to conspecifics?

      The phrase "reflected off conspecifics" refers to echoes originating from the bat’s own call and reflected off the bodies of nearby conspecifics. We have clarified the wording in the revised text to avoid confusion.

      (9) Line 125: Why are low-frequency channels stimulated by higher frequencies? This needs further clarification.

      The cochlear filter bank in our model is implemented using gammatone filters, each modeled as an 8th-order Butterworth filter. Due to the non-ideal filter response and relatively broad bandwidths, especially in the lower-frequency channels, strong energy from the beginning of the downward FM chirp (at higher frequencies) can still produce residual activation in lower-frequency channels. While these stimulations are usually below the detection threshold, they may still be visible as early sub-threshold responses. Because this explanation is technical (a property of the filter implementation) and the effect does not influence the detection outcomes, we have chosen not to elaborate on it in the figure caption or Methods.

      (10) Lines 146-150: This is an interesting finding. Is there a theoretical justification for it?

      This outcome arises directly from the simulation results. As noted in the Discussion (lines 359-365), although Pipistrellus kuhlii (PK) shows a modest advantage in jamming resistance due to its broader bandwidth, the redundancy in sensory information across calls—enabled by frequent echolocation—appears to compensate for these signal differences. As a result, the small variations in echo quality between species do not translate into significant differences in performance. We speculate that if the difference in jamming probability had been larger, performance disparities would likely have emerged.

      (11) Line 151: The authors define a jammed echo as an echo entirely missed due to masking. Is this appropriate? Doesn't echo mis-assignment also constitute jamming?

      We agree that echo mis-assignment can also degrade performance; however, in our model, we distinguish between two outcomes: (1) complete masking (echo not detected), and (2) detection with a localization error. As explained in the Methods (lines 500–507), we run the detection analysis twice—once with only desired echoes (“interference-free detection”) and once including masking signals (“full detection”). If a previously detected echo is no longer detected, it is classified as a jammed echo. If the echo is still detected but the delay shifts by more than 100 µs compared to the interference-free condition, it is also considered jammed. If the delay shift is smaller, it is treated as a detection with localization error rather than full jamming. We have clarified this distinction in the revised Methods section.
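
      The two-pass classification described above can be sketched as follows. The 100 µs limit comes from the response; the function name, the label strings, and the use of `None` for a missed echo are assumptions of this sketch, and the example delays are invented.

```python
def classify_echo(delay_free_s, delay_full_s, max_shift_s=100e-6):
    """Compare one echo across the interference-free and full detection passes.

    delay_free_s: detected delay without masking signals (None if not detected).
    delay_full_s: detected delay with masking signals (None if not detected).
    """
    if delay_free_s is None:
        return "not_detected"        # never detected, so it cannot be jammed
    if delay_full_s is None:
        return "jammed"              # masked completely
    if abs(delay_full_s - delay_free_s) > max_shift_s:
        return "jammed"              # delay shifted by more than the limit
    return "detected_with_error"     # detected, possibly with a small localization error


# Hypothetical (interference-free, full) delay pairs in seconds.
pairs = [(0.01000, 0.01000), (0.01000, None), (0.01000, 0.01015), (0.01000, 0.01005)]
print([classify_echo(free, full) for free, full in pairs])
```

      Under this reading, a delay shift larger than 100 µs is counted as jamming, so large mis-localizations contribute to the jamming probability even though the echo is nominally detected.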

      (12) Figure 2-E: Detection probability statistics are of limited usefulness without accompanying false alarm rate (FAR) statistics. Do the authors have FAR numbers?

      We understand FAR to refer to instances where masking signals or other acoustic phenomena are mistakenly interpreted as real echoes from physical objects. As explained in the manuscript, we implemented two model versions: one without confusion, and one with full confusion.

      Figure 2E reports detection performance under the non-confusion model, in which only echoes from actual physical reflectors are used, and no false detections occur—hence, the false alarm rate is effectively zero in this condition. In the full-confusion model, all detected echoes—including those originating from masking signals or conspecific calls—are treated as valid detections, which may include false alarms. However, we did not explicitly quantify the false alarm rate as a separate metric in this simulation.

      We agree that tracking FAR could be informative and will consider incorporating it into future versions of the model.

      (13) Line 161: RM bats suffered from a significantly higher probability of the "desired conspecific's echoes" being jammed. What does "desired conspecific's echoes" mean? This is unclear.

      The term “desired conspecific's echoes” refers to echoes originating from the bat’s own call, reflected off nearby conspecifics, which are treated as relevant reflectors for collision avoidance. We have revised the wording in the text for clarity.

      (14) Line 188: Why didn't the size of the integration window affect jamming probability? I couldn't find this explained in the discussion.

      The jamming probability in our analysis is computed at the individual-echo level, prior to any temporal integration. Since the integration window is applied after the detection step, it does not influence whether a specific echo is masked (i.e., jammed) or not. Therefore, as expected, we did not observe a significant effect of integration window size on jamming probability.

      (15) Line 217-218: Why do the authors think this would be?

      Thank you for the thoughtful question. We agree that, in theory, increasing call intensity should raise the levels of both desired echoes and masking signals proportionally. However, in our model, the environmental noise floor and detection threshold remain constant, meaning that higher call intensities increase the signal-to-noise ratio (SNR) more effectively for weaker echoes, especially those at longer distances or with low reflectivity. This could lead to a higher likelihood of those echoes crossing the detection threshold, resulting in a small but measurable reduction in jamming probability.

      Additionally, the non-linear behavior of the filter-bank receiver, including thresholding at multiple stages, can introduce asymmetries in how increased signal levels affect the detection of target versus masking signals.

      That said, the effect size was small, and the improvement in jamming probability did not translate into any significant gain in behavioral performance (e.g., exit probability or collision rate), as shown in Figure 3C.

      (16) Line 233: I'm not sure I understand how a slightly improved aggregation model that clustered detected reflectors over one-second periods is different. Doesn't this just lead to on average more calls integrated into memory?

      While increasing the memory duration does lead to more detections being available, the enhanced aggregation model (which we now refer to as multi-call clustering) differs fundamentally from the simpler one. As detailed in the Methods, it includes additional processing steps: clustering spatially close detections, removing outliers, and estimating wall directions based on the spatial structure of clustered echoes. In contrast, the simpler model treats each detection as an isolated point without estimating obstacle orientation. These additional steps allow for more robust environmental interpretation and significantly improve performance under high-confusion conditions. We have clarified this in the revised text (lines 606-616) and added Supplementary Figure 2B.

      (17) Table 1: What about conspecific target strength?

      We have now added the conspecific target strength as a tested parameter in Table 1, along with its tested range, default value, and measured effect sizes. A detailed sensitivity analysis is also presented in Supplementary Figure 4, demonstrating that variations in conspecific target strength had relatively minor effects on performance metrics.  

      (18) Figure 3-A: The x-axis is the number of calls in the integration window. But the leftmost sample on each curve is at 0 calls. Shouldn't this be 1?

      “0 calls” refers to the case where only the most recent call is used for pathfinding—without integrating any information from prior calls. The x-axis reflects the number of previous calls stored in memory, so a value of 0 still includes the current call. We’ve clarified this terminology in the figure caption.

      (19) Lines 282-283: This statement needs to be clarified that it is with the constraints of using a 2D simulation with at most 33 bats/m^2. It also should be clarified that it is assumed the bat can reliably distinguish between its own echoes and conspecific echoes, which is a very important caveat.

      We have revised the text to clarify that the results are based on a 2D simulation with a maximum tested density of 33 bats/m². We also now explicitly state that the model assumes bats can distinguish between their own echoes and those generated by conspecifics, an assumption we recognize as a simplification. These clarifications help place the results within the scope and constraints of the simulation. Moreover, as described in the text and noted in a previous response, the average distance to the nearest bat in our simulation is 0.27 m (with 100 bats), whereas reported distances in very dense colonies are 0.5 m.

      (20) Line 294: What is this sentence referring to?

      The sentence refers to the finding that, even under high bat densities, a substantial portion of the echoes, particularly those reflected from nearby obstacles (e.g., 1 m away), were jammed due to masking. Nevertheless, the bats in the simulation were still able to navigate successfully using partial sensory input. We have clarified the sentence in the revised text to make this point more explicit (see lines 333-336).

      (21) Line 302: Was jamming less likely when IPI was higher or lower? I could not find this demonstrated anywhere in the manuscript.

      We agree that the original text was not sufficiently clear on this point. While we did not explicitly test fixed IPI values as a parameter, the model does simulate the natural behavior of decreasing IPI as bats approach obstacles. This behavior is supported by empirical observations and is incorporated into the echolocation dynamics of the simulation. We have clarified this point in the revised text (see Lines 346-351) and explained that while lower IPI introduces more acoustic overlap, it also increases redundancy and improves detection through temporal integration.

      (22) Lines 313-314: This is an interesting assumption, but it is not evident that is substantiated by the references.

      The claim is based on well-established principles in signal processing and bioacoustics. Wideband signals, such as those emitted by PK bats, distribute their energy over a broader frequency range, which makes them inherently more resistant to narrowband interference and masking. This concept is commonly applied in both biological and artificial sonar systems and is supported by empirical studies in bats and theory in acoustic sensing.

      For example, Beleyur & Goerlitz (2019) demonstrate that broader bandwidth calls improve detection in cluttered and jamming-prone environments. Similarly, Ulanovsky et al. (2004) and Schnitzler & Kalko (200) discuss how FM bats' wideband calls enhance temporal and spatial resolution, helping to reduce the impact of overlapping signals from conspecifics. These findings align with communication theory where spread-spectrum techniques improve robustness in noisy environments.

      We agree with the reviewer that this is an important point, and we have updated the manuscript to clarify this rationale and cite the relevant literature accordingly (lines 631-363).

      (23) Lines 318-319: What is the justification for "probably"? Isn't this just a supposition?

      We agree with the reviewer’s point and have rephrased the sentence.

      (24) Line 320: How does this 63% performance match the sentence in line 295?

      The sentence in Line 295 refers to the overall ability of the bats to navigate successfully despite high jamming levels, highlighting the robustness of the strategy under challenging conditions. The figure in Line 320 (63%) quantifies this performance under the most extreme simulated scenario (100 bats / 3 m²), where both spatial and acoustic interferences are maximal. We have rephrased the text in the revised version (lines 324-327).

      (25) Lines 341-345: It seems like this is more likely to be the main takeaway of the paper.

      As noted in the Public Review above, there is substantial literature supporting the assumption that bats can recognize their own echoes and distinguish them from those of conspecifics (e.g., Schnitzler, Bioscience, 2001; Kazial et al., 2001, 2008; Burnett & Masters, 2002; Chiu et al., 2009; Yovel et al., 2009; Beetz & Hechavarría, 2022). Therefore, we consider our assumption of self-recognition to be well-supported, at least under typical conditions. That said, we agree that the impact of echo confusion on performance is significant and highlights a critical challenge in dense environments.

      To our knowledge, this is the first computational model to explicitly simulate both self-recognition and full echo confusion under high-density conditions. We believe that the combination of modeled constraints and the demonstrated robustness of simple sensorimotor strategies, even under worst-case assumptions, is what makes this contribution both novel and meaningful.

      (26) Lines 349-350: What is the aggregation model? What is meant by "integration"?

      We have revised the text to clarify that the “aggregation model” refers to a multi-call clustering process that includes clustering of detections, removal of outliers, and estimation of wall orientation, as described in detail in the revised Methods and Results sections.

      (27) Line 354: Again, why isn't this the assumption we're working under?

      As addressed in our response to Comment 25, our primary model assumes that bats can recognize their own echoes—an assumption supported by substantial empirical evidence. The alternative "full confusion" model was included to explore a worst-case scenario and highlight the behavioral consequences of failing to distinguish self from conspecific echoes. We assume that real bats may experience some degree of echo misidentification; however, our assumption of full confusion represents a worst-case scenario.

      (28) Line 382: "Under the assumption that..." I agree that bats probably can, but if we assume they can differentiate them all, where's the jamming problem?

      The assumption that bats can theoretically distinguish between different signal sources applies after successful detection. However, the jamming problem arises during the detection and localization stages, where acoustic interference can prevent echoes from crossing the detection threshold or distort their timing.

      (29) Lines 386-387: The paper referenced focused on JAR in the context of foraging. What changes were made to the simulation to switch to obstacle avoidance?

      While the simulation framework in Mazar & Yovel (2020) was developed to study jamming avoidance during foraging, the core components—such as the acoustic calculations, receiver model, and echolocation behavior—remain applicable. For the current study, we adapted the simulation extensively to address colony-exit behavior. These modifications include modeling cave walls as acoustic reflectors, implementing a pathfinding algorithm, integrating obstacle-avoidance maneuvers, and adapting the integration window and integration processes. These updates are detailed throughout the Methods section.

      (30) Line 400-402: Something doesn't add up with the statement: each decision relies on an integration window that records estimated locations of detected reflectors from the last five echolocation calls, with the parameter being tested between 1 and 10 calls. Can the authors reword this to make it less confusing?

      We have reworded the sentence to clarify that the default integration window includes five calls, while we systematically tested the effect of using 1 to 10 calls, see lines 486-487.

      (31) Line 393: "30 deg/sec" why was this value chosen?

      The turning rate of 30 deg/sec was manually selected to approximate the curvature of natural foraging flight paths observed in Rhinopoma microphyllum using on-board tags. Moreover, in Mazar & Yovel (2020), we showed that the flight dynamics of simulated bats in a closed room closely matched those of Pipistrellus kuhlii flying in a room of similar dimensions. However, in the current simulation, bats rarely follow a random-walk trajectory due to the structured environment and frequent obstacle detection. As a result, this parameter has no meaningful impact on the simulation outcomes.

      (32) Line 412: "Harmony" --- do you mean harmonic? And what is the empirical evidence that RM bats use the 2nd harmonic compared to the 1st?

      Perhaps showing a spectrogram of a real RM signal would be helpful.

      The typo has been corrected. For reference, see Goldshtein et al. (2025).

      (33) Table 2: Something is incorrect with the table. The first row on the next page is the wrong species name. Also, where are the citations for these parameter values?

      The table header has been corrected in the revised version. The parameter values for flight and echolocation behavior were derived from existing literature and empirical data: Pipistrellus kuhlii parameters were based on Kalko (1995), and Rhinopoma microphyllum parameters were extracted from our own recordings using on-board tags, as described in Goldshtein et al. (2025). We have added the appropriate citations to Table 2.

      (34) Line 442: How was the threshold level chosen?

      The detection threshold in each level is set to the greater of either 7 dB above the noise level (0 dB-SPL) or the maximal received level minus 70 dB, effectively applying a dynamic range of 70 dB.
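
      The threshold rule above amounts to taking the larger of two candidate levels. The sketch below illustrates it under stated assumptions: the function and parameter names are hypothetical, and all levels are taken to be in dB SPL.

```python
def detection_threshold_db(noise_level_db: float,
                           max_received_db: float,
                           margin_db: float = 7.0,
                           dynamic_range_db: float = 70.0) -> float:
    """Return the detection threshold in dB SPL.

    The threshold is the greater of (noise level + 7 dB) and
    (maximal received level - 70 dB), as stated in the response.
    Function and parameter names are illustrative assumptions.
    """
    noise_limited = noise_level_db + margin_db
    range_limited = max_received_db - dynamic_range_db
    return max(noise_limited, range_limited)
```

      With a 0 dB SPL noise level, the noise-limited term (7 dB) governs until the strongest received signal exceeds 77 dB SPL, after which the 70 dB dynamic range sets the threshold.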

      (35) Line 445: 100 micros: This is about 3cm. The resolution of PK is about 1cm. For RM it's about 10cm. So, this window is generous for PK, but too strict for RM.

      To keep the model simple and avoid introducing species-specific detection thresholds, we selected a biologically plausible compromise that could reasonably apply to both species. This simplification ensures consistency across simulations while remaining within the known behavioral range.

      (36) Line 448: What is the spectrum of the Gaussian noise, and did it change between PK and RM?

      We used the same white Gaussian noise with a flat spectrum across the relevant frequency range (10–80 kHz) for both species. We have clarified this in the revised text in lines 570-572.

      (37) Line 451: 4 milliseconds is 1.3m. Is this appropriate?

      The 4-millisecond window was selected based on established auditory masking thresholds described in Mazar & Yovel (2020) and supported by Popper and Fay (1995, ch. 2.4.5), Blauert (1997, ch. 3.1), and Mohl and Surlykke (1989). These values provide conservative lower bounds on bats’ ability to cope with masking (Beleyur and Goerlitz, 2019). For simplicity, we used constant thresholds within each window (see lines 574-576).
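
      For readers checking the distances quoted in comments 35 and 37, a time interval converts to an acoustic path length by multiplying by the speed of sound. The sketch below assumes 343 m/s (air at roughly 20 °C), a value that is not stated in the manuscript, and the function names are hypothetical. For an echo, the corresponding target range is half the path length because the sound travels out and back.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def path_length_m(interval_s: float,
                  speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Distance sound travels during the given time interval."""
    return interval_s * speed_m_s


def echo_range_m(delay_s: float,
                 speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Target range for a two-way (call and echo) delay."""
    return path_length_m(delay_s, speed_m_s) / 2.0
```

      Under this assumption, 100 µs corresponds to a path length of about 3.4 cm and 4 ms to about 1.37 m, in line with the approximate figures given by the reviewer.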

      (38) Line 452: Citation for the forward and backward masking durations?

      See the response to the previous comment.

      (39) Lines 460-461: This is unclear. How does the bat get directional information? The authors claim to be able to measure direction-of-arrival for each detection, but it is not clear how this is done

      As noted in our response to Reviewer 1 (Comment on Line 457), directional information is not computed via an explicit binaural model. Instead, we assume the bat estimates the direction of arrival with an angular error that depends on the SNR, based on established studies (e.g., Simmons et al., 1983; Popper & Fay, 1995). We have clarified this in the revised text in lines 583-592.

      (40) Line 467: It seems like the authors are modeling pulse-echo ambiguity, at least in this one alternative model, which is good! However the alternative model doesn't get much attention in the paper. Is there a reason for this?

      We would like to clarify that we did not model pulse-echo ambiguity. In our confusion model, all echoes received within the IPI are attributed to the bat’s most recent call. This includes echoes that may in fact originate from conspecific calls, but the model does not assign self-echoes to earlier pulses or span multiple IPIs. Therefore, while the model captures echo confusion, it does not include true pulse-echo ambiguity. We have clarified this point in the revised text in lines 551-553.
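
      The attribution rule of the confusion model can be illustrated with a short sketch. The function name, the list-based representation of arrival times, and the half-open interval convention are assumptions made for this example and are not taken from the simulation code.

```python
from typing import List


def delays_attributed_to_call(arrival_times_s: List[float],
                              call_time_s: float,
                              next_call_time_s: float) -> List[float]:
    """Attribute every arrival within one IPI to the most recent call.

    arrival_times_s holds absolute arrival times of all detected signals,
    whether they are echoes of the bat's own call or stem from conspecific
    calls. Each arrival inside [call_time_s, next_call_time_s) is assigned
    a delay relative to call_time_s. Nothing is assigned to earlier calls,
    so pulse-echo ambiguity is not represented.
    """
    return [t - call_time_s
            for t in arrival_times_s
            if call_time_s <= t < next_call_time_s]
```

      A signal that arrives after the next call is emitted is handled only in the following IPI, which is why the sketch never spans more than one interval.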

      (41) Line 41: "continuous" is more appropriate than "constant".

      Thank you, we have rephrased the text accordingly.

      (42) Line 69: "band width" should be one word.

      Thank you, we have corrected it to “bandwidth”.

      (43) Line 79: "bats" should be in the possessive.

      Thank you, the text has been rephrased.

      (44) Line 128: "convoluted" don't you mean "convolved"?

      We have replaced “convoluted” with the correct term “convolved” in the revised text.

      (45) Please check your references, as there are some incomplete citations and typos.

      Thank you, we have reviewed and corrected all references for completeness and consistency.

      References

      Beetz, M.J. and Hechavarría, J.C. (2022) ‘Neural Processing of Naturalistic Echolocation Signals in Bats’, Frontiers in Neural Circuits, 16, p. 899370. Available at: https://doi.org/10.3389/FNCIR.2022.899370.

      Beleyur, T. and Goerlitz, H.R. (2019) ‘Modeling active sensing reveals echo detection even in large groups of bats’, Proceedings of the National Academy of Sciences of the United States of America, 116(52), pp. 26662–26668. Available at: https://doi.org/10.1073/pnas.1821722116.

      Betke, M. et al. (2008) ‘Thermal Imaging Reveals Significantly Smaller Brazilian Free-Tailed Bat Colonies Than Previously Estimated’, Journal of Mammalogy, 89(1), pp. 18–24. Available at: https://doi.org/10.1644/07-MAMM-A-011.1.

      Blauert, J. (1997) ‘Spatial Hearing: The Psychophysics of Human Sound Localization (rev. ed.)’.

      Boerma, D.B. et al. (2019) ‘Wings as inertial appendages: How bats recover from aerial stumbles’, Journal of Experimental Biology, 222(20). Available at: https://doi.org/10.1242/JEB.204255.

      Boonman, A. et al. (2013) ‘It’s not black or white-on the range of vision and echolocation in echolocating bats’, Frontiers in Physiology, 4, pp. 1–12. Available at: https://doi.org/10.3389/fphys.2013.00248.

      Boonman, A.M., Parsons, S. and Jones, G. (2003) ‘The influence of flight speed on the ranging performance of bats using frequency modulated echolocation pulses’, The Journal of the Acoustical Society of America, 113(1), p. 617. Available at: https://doi.org/10.1121/1.1528175.

      Burnett, S.C. and Masters, W.M. (2002) ‘Identifying Bats Using Computerized Analysis and Artificial Neural Networks’, North American Symposium on Bat Research, 9.

      Chiu, C., Xian, W. and Moss, C.F. (2009) ‘Adaptive echolocation behavior in bats for the analysis of auditory scenes’, Journal of Experimental Biology, 212(9), pp. 1392–1404. Available at: https://doi.org/10.1242/jeb.027045.

      Fujioka, E. et al. (2021) ‘Three-Dimensional Trajectory Construction and Observation of Group Behavior of Wild Bats During Cave Emergence’, Journal of Robotics and Mechatronics, 33(3), pp. 556–563. Available at: https://doi.org/10.20965/jrm.2021.p0556.

      Gillam, E.H. et al. (2010) ‘Echolocation behavior of Brazilian free-tailed bats during dense emergence flights’, Journal of Mammalogy, 91(4), pp. 967–975. Available at: https://doi.org/10.1644/09-MAMM-A-302.1.

      Goldshtein, A. et al. (2025) ‘Onboard recordings reveal how bats maneuver under severe acoustic interference’, Proceedings of the National Academy of Sciences, 122(14), p. e2407810122. Available at: https://doi.org/10.1073/PNAS.2407810122.

      Griffin, D.R., Webster, F.A. and Michael, C.R. (1958) ‘The echolocation of flying insects by bats’, Animal Behaviour, 8(3-4).

      Hagino, T. et al. (2007) ‘Adaptive SONAR sounds by echolocating bats’, International Symposium on Underwater Technology, UT 2007 - International Workshop on Scientific Use of Submarine Cables and Related Technologies 2007, pp. 647–651. Available at: https://doi.org/10.1109/UT.2007.370829.

      Hiryu, S. et al. (2008) ‘Adaptive echolocation sounds of insectivorous bats, Pipistrellus abramus, during foraging flights in the field’, The Journal of the Acoustical Society of America, 124(2), pp. EL51–EL56. Available at: https://doi.org/10.1121/1.2947629.

      Jakobsen, L. et al. (2024) ‘Velocity as an overlooked driver in the echolocation behavior of aerial hawking vespertilionid bats’. Available at: https://doi.org/10.1016/j.cub.2024.12.042.

      Jakobsen, L., Brinkløv, S. and Surlykke, A. (2013) ‘Intensity and directionality of bat echolocation signals’, Frontiers in Physiology, 4, pp. 1–9. Available at: https://doi.org/10.3389/fphys.2013.00089.

      Jakobsen, L. and Surlykke, A. (2010) ‘Vespertilionid bats control the width of their biosonar sound beam dynamically during prey pursuit’, 107(31). Available at: https://doi.org/10.1073/pnas.1006630107.

      Kalko, E.K.V. (1995) ‘Insect pursuit, prey capture and echolocation in pipistrelle bats (Microchiroptera)’, Animal Behaviour, 50(4), pp. 861–880.

      Kazial, K.A., Burnett, S.C. and Masters, W.M. (2001) ‘ Individual and Group Variation in Echolocation Calls of Big Brown Bats, Eptesicus Fuscus (Chiroptera: Vespertilionidae) ’, Journal of Mammalogy, 82(2), pp. 339–351. Available at: https://doi.org/10.1644/15451542(2001)082<0339:iagvie>2.0.co;2.

      Kazial, K.A., Kenny, T.L. and Burnett, S.C. (2008) ‘Little brown bats (Myotis lucifugus) recognize individual identity of conspecifics using sonar calls’, Ethology, 114(5), pp. 469–478. Available at: https://doi.org/10.1111/j.1439-0310.2008.01483.x.

      Kick, S.A. (1982) ‘Target-detection by the echolocating bat, Eptesicus fuscus’, Journal of Comparative Physiology A, 145(4), pp. 431–435. Available at: https://doi.org/10.1007/BF00612808.

      Kothari, N.B. et al. (2014) ‘Timing matters: Sonar call groups facilitate target localization in bats’, Frontiers in Physiology, 5. Available at: https://doi.org/10.3389/fphys.2014.00168.

      Mohl, B. and Surlykke, A. (1989) ‘Detection of sonar signals in the presence of pulses of masking noise by the echolocating bat, Eptesicus fuscus’, pp. 119–124.

      Moss, C.F. and Surlykke, A. (2010) ‘Probing the natural scene by echolocation in bats’, Frontiers in Behavioral Neuroscience. Available at: https://doi.org/10.3389/fnbeh.2010.00033.

      Neretti, N. et al. (2003) ‘Time-frequency model for echo-delay resolution in wideband biosonar’, The Journal of the Acoustical Society of America, 113(4), pp. 2137–2145. Available at: https://doi.org/10.1121/1.1554693.

      Popper, A.N. and Fay, R.R. (1995) Hearing by Bats. Springer-Verlag.

      Roy, S. et al. (2019) ‘Extracting interactions between flying bat pairs using model-free methods’, Entropy, 21(1). Available at: https://doi.org/10.3390/e21010042.

      Sabol, B.M. and Hudson, M.K. (1995) ‘Technique using thermal infrared-imaging for estimating populations of gray bats’, Journal of Mammalogy, 76(4). Available at: https://doi.org/10.2307/1382618.

      Saillant, P.A. et al. (1993) ‘A computational model of echo processing and acoustic imaging in frequency-modulated echolocating bats: The spectrogram correlation and transformation receiver’, The Journal of the Acoustical Society of America, 94(5). Available at: https://doi.org/10.1121/1.407353.

      Salles, A., Diebold, C.A. and Moss, C.F. (2020) ‘Echolocating bats accumulate information from acoustic snapshots to predict auditory object motion’, Proceedings of the National Academy of Sciences of the United States of America, 117(46), pp. 29229–29238. Available at: https://doi.org/10.1073/PNAS.2011719117.

      Sanderson, M.I. et al. (2003) ‘Evaluation of an auditory model for echo delay accuracy in wideband biosonar’, The Journal of the Acoustical Society of America, 114(3), pp. 1648–1659. Available at: https://doi.org/10.1121/1.1598195.

      Schnitzler, H.-U. and Kalko, E.K.V. (2001) ‘Echolocation by insect-eating bats’, BioScience, 51(7), p. 557. Available at: https://academic.oup.com/bioscience/article-abstract/51/7/557/268230 (Accessed: 17 March 2025).

      Schnitzler, H.-U. et al. (1987) ‘The echolocation and hunting behavior of the bat, Pipistrellus kuhli’, Journal of Comparative Physiology A, 161(2), pp. 267–274. Available at: https://doi.org/10.1007/BF00615246.

      Simmons, J.A. et al. (1983) ‘Acuity of horizontal angle discrimination by the echolocating bat, Eptesicus fuscus’.

      Simmons, J.A. and Kick, S.A. (1983) ‘Interception of Flying Insects by Bats’, Neuroethology and Behavioral Physiology, pp. 267–279. Available at: https://doi.org/10.1007/978-3-642-69271-0_20.

      Surlykke, A., Ghose, K. and Moss, C.F. (2009) ‘Acoustic scanning of natural scenes by echolocation in the big brown bat, Eptesicus fuscus’, Journal of Experimental Biology, 212(7), pp. 1011–1020. Available at: https://doi.org/10.1242/JEB.024620.

      Theriault, D.H. et al. (no date) ‘Reconstruction and analysis of 3D trajectories of Brazilian free-tailed bats in flight’, cs-web.bu.edu [Preprint]. Available at: https://csweb.bu.edu/faculty/betke/papers/2010-027-3d-bat-trajectories.pdf (Accessed: 4 May 2023).

      Ulanovsky, N. and Moss, C.F. (2008) ‘What the bat’s voice tells the bat’s brain’, Proceedings of the National Academy of Sciences of the United States of America, 105(25), pp. 8491–8498. Available at: https://doi.org/10.1073/pnas.0703550105.

      Vanderelst, D. and Peremans, H. (2018) ‘Modeling bat prey capture in echolocating bats: The feasibility of reactive pursuit’, Journal of Theoretical Biology, 456, pp. 305–314.

      Yovel, Y. et al. (2009) ‘The voice of bats: How greater mouse-eared bats recognize individuals based on their echolocation calls’, PLoS Computational Biology, 5(6). Available at: https://doi.org/10.1371/journal.pcbi.1000400.

      Yovel, Y. and Ulanovsky, N. (2017) ‘Bat Navigation’, The Curated Reference Collection in Neuroscience and Biobehavioral Psychology, pp. 333–345. Available at: https://doi.org/10.1016/B978-0-12-809324-5.21031-6.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Based on the below reviews, we propose the following revision plan. Briefly:

      • We will re-focus the manuscript on the developmental data providing a molecular and cellular blueprint of lining macrophage development. The novelty and relevance of our developmental data have been highlighted by all three reviewers, and they have also praised the rigor of these experiments and their interpretation. We thus believe that this re-focus will improve the manuscript's message.
      • We will include our data on CSF1 as a key signal. Whilst previously appreciated as a factor required for tissue-resident macrophages, including those in the joint, our study is the first to show the requirement of lining macrophages over a complete developmental time course, using modern readouts, and in a model that circumvents the limitations of previously used approaches (see point-by-point response for details).
      • However, we will remove the functional data on TGFβ signaling and mechanical loading/mechanosensing. We agree with the reviewers that we would need to generate additional histological and molecular data from conditional knockout mice, antibody and (ant)agonist treatments, and the optogenetic model to determine their exact involvement in lining macrophage maturation. These experiments require significant time and other resources. We would therefore like to uncouple this question for a follow-on manuscript and re-focus the current study as a developmental atlas. Removal of (some of) these data was also suggested in the reviewers' comments.
      • To further elevate our developmental atlas, we propose to include additional data and new analyses delineating the developmental dynamics of synovial fibroblasts at the single-cell (transcriptomic) level. This change to the original manuscript was not requested by the reviewers, but we are proposing it proactively because we believe it would be an impactful addition to a revised version of our study, also providing data on the maturation of the synovial (lining) macrophage niche. Again, this will re-focus the manuscript on the developmental data and provide a novel, valuable resource for those interested in joint biology.
      • We will otherwise respond to all individual reviewer comments and implement the requested changes, unless technically not possible. We are convinced that this revision plan will result in a manuscript that fits very well with the remit of Genes & Development.

      Please find below detailed point-by-point answers.

      Reviewer #1

      Evidence, reproducibility and clarity

      In their manuscript entitled "The synovial lining macrophage layer develops in the first weeks of life in a CSF1- and TGFβ-dependent but monocyte-independent process," the authors explore the developmental trajectory of synovial lining macrophages. They demonstrate that the formation of this specialized macrophage layer is age-dependent and governed by a distinct developmental program that proceeds independently of circulating monocytes. Through scRNA-Seq, the authors show that synovial lining macrophages originate locally from Aqp1⁺ macrophages and are marked by the expression of Csf1r, Tgfbr, and Piezo1. Notably, genetic ablation of each of these factors impaired the development of lining macrophages to varying degrees, suggesting differential contributions of CSF1, TGFβ, and PIEZO1 signaling pathways to their maturation and maintenance.

      The manuscript is well written, and the data quality and representation is of a high standard. The authors have employed a sophisticated array of state-of-the-art mouse models and cutting-edge technologies to elucidate the developmental origin of synovial lining macrophages. Notably, the supporting scRNA-Seq datasets are of excellence and provide valuable insights that will likely be of significant interest to researchers in the field of immunology and joint biology. Accordingly, the experimental approach and interpretations regarding macrophage origin are well-founded and compelling. However, in the eye of the reviewer, the section addressing the underlying molecular mechanisms is a bit less convincing. This part of the study appears slightly underdeveloped, and some of the mechanistic claims lack sufficient experimental clarity. A more rigorous experimental investigation would be essential to reinforce the manuscript's conclusions, particularly concerning the data related to Tgfbr and Piezo1, where the current evidence appears insufficiently substantiated.

      We thank the reviewer for their positive and constructive evaluation of our manuscript. We agree with them (and the other reviewers) that our functional data on the involvement of TGFβ signaling and mechanical loading/mechanosensing are comparably less convincing and substantiated than our developmental data. We are very grateful for their (and the other reviewers') suggestions to provide more support for the involvement of these factors in lining macrophage development. However, we think that carrying this out to the same high standard will require substantial time and other resources. We have therefore decided to uncouple this from the developmental data and pursue this in follow-up work. We will re-focus the current manuscript on the developmental data. We have proposed to the editors to instead include additional data on synovial fibroblast development, to complement our macrophage data and also delineate the maturation of their niche, thereby providing a conclusive developmental atlas.

      Major point:

      1. The numbers of VSIG4⁺ macrophages appear either unaffected or only minimally altered in both Csf1rMerCreMer Tgfbr2floxed and Fcgr1Cre Piezo1floxed mouse models, respectively. This raises an important question: was the gene deletion efficiency sufficient in each model? Accordingly, the authors are encouraged to include quantitative data on gene deletion efficiency for both mouse models, as this information is critical for interpreting the observed phenotypic outcomes and validating the conclusions regarding gene function. Furthermore, to better assess the impact of Tgfbr2 and Piezo1 disruption, the authors should provide more comprehensive flow cytometry analyses and histological data for these mouse models. Given the apparent homogeneity of VSIG4⁺ macrophages (as shown by the authors themselves), bulk RNA-Seq of sorted Tgfbr2- and Piezo1-deficient VSIG4⁺ macrophages (or from TGFβ-treated animals) would offer valuable insights into both the effectiveness of gene deletion and the molecular pathways governed by TGFβ and PIEZO1 in lining macrophages.

      As outlined above, we have decided to uncouple our functional data on TGFβ, Piezo1 and mechanical loading. The points raised here are all very valid, and we will implement your suggestions in our follow-up functional work focusing on signaling events regulating lining macrophage development. On the suggestion to perform bulk RNA sequencing for VSIG4+ macrophages: This is a good one in principle - although we will not be able to use this strategy where we want to assess the consequences of experimental treatments or genetic models on lining macrophage maturation, because acquisition of VSIG4 is a key maturation event that might be impaired in these conditions.

      Minor points:

      Consistent usage of Cx3cr1-GFP+ nomenclature (for instance: Fig. S1 legend "adult mouse synovial tissue, showing PDGFRα⁺ fibroblasts (yellow) and CX3CR1-GFP⁺ cells (cyan)." versus Fig. 1 legend "Automated spot detection highlights Cx3cr1-GFP⁺ macrophages)".

      We will implement these changes.

      Unclear Fig. 3 legend: "Representative immunofluorescence images of synovial tissue from Clec9aCre:Rosa26lsl-tdT mice at 3 weeks and in adulthood, showing and tdTomato (yellow) and stained for DAPI (blue), VSIG4 (cyan)" Check 'showing and tdTomato.'

      We will implement these changes.

      For greater clarity, it would have been helpful if the transcript names had been directly included within Figures 3C, S3A, and S3C.

      We will implement these changes.

      Page 24: "(Mki67CreERT2:Rosa26lsl-tdT)" Last bracket not superscript.

      We will implement these changes.

      Page 25: "we again leveraged our scRNAsequencing dataset" Missing punctuation.

      We will implement these changes.

      Page 27: Fig. 5C legend: " of synovial tissue of 1 week-old, 3 weeks-old and adult mice." Please specify and change to 'adult Csf1rΔFIRE/ΔFIRE mice'.

      We will implement these changes.

      Page 30: The outcome observed in the Acta1-rtTA:tetO-Cre:ChR2-V5fl mouse model appears to be inconclusive: "This approach resulted in an increased density of VSIG4+ and total (F4/80+) macrophages in the exposed leg of some 5 days-old pups, but others showed the opposite trend (Figure S5D)." This variability may reflect low efficiency of the model or other technical limitations (e.g. muscle contractions frequency or time point of analysis). Given this ambiguity, it is worth reconsidering whether the data are sufficiently robust to warrant inclusion. Should the authors choose to include these findings, further experimentation of appropriate depth and precision is required to allow a conclusive interpretation (either it increases the density of VSIG4+ macrophages or not). The same applies to the Yoda1-treated mice, for which additional data are needed to determine whether VSIG4⁺ macrophage density is truly affected.

      We have decided to remove the data on the optogenetic mouse model and Yoda1 treatment and follow up on these separately, implementing these suggestions, including proof-of-concept data for optogenetically induced muscle contractions.

      Significance

      General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed? This is a well-designed study that uses cutting-edge methodologies to investigate the developmental trajectory of synovial lining macrophages under homeostatic conditions. The authors present robust experimental evidence and compelling interpretations concerning synovial macrophage origin, which are both well-substantiated and impactful. Nonetheless, from the reviewer's perspective, the section exploring the molecular mechanisms underlying macrophage differentiation is comparatively less convincing. This section appears somewhat underdeveloped, as some of the mechanistic claims lack sufficient depth and experimental rigor to fully substantiate the conclusions.

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field: In contrast to earlier studies (PMID: 31391580, 32601335), the inclusion of fate-mapping experiments adds an important dimension, offering novel insight into the ontogeny of synovial macrophages. This expanded perspective may prove particularly valuable in advancing our understanding of joint immunology, especially regarding the local origins and lineage relationships of macrophage populations.

      Furthermore, the authors present novel insights into the molecular pathways underlying the differentiation and development of synovial lining macrophages. By demonstrating previously unrecognized regulatory mechanisms, this work significantly deepens our understanding of the cellular and transcriptional programs that drive macrophage specialization within the joint microenvironment.

      Place the work in the context of the existing literature (provide references, where appropriate): This study builds upon previous work characterizing the macrophage compartment in the joint (PMID: 31391580, 32601335), yet provides a substantially more comprehensive dataset that spans multiple developmental time points and data on the origin of this specialized macrophage subset.

      State what audience might be interested in and influenced by the reported findings: Immunologists, clinicians

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. This study falls well within the scope of the reviewer's expertise in innate immunity.

      Reviewer #2

      Evidence, reproducibility and clarity

      In the manuscript "The synovial lining macrophage layer develops in the first weeks of life in a CSF1- and TGFβ-dependent but monocyte-independent process", Magalhaes Pinto and colleagues carefully employ a wide range of technologies including single cell profiling, imaging and an exceptional combination of fate mapping models to characterize the ontogeny and development of lining macrophages in the joint, thus dissecting their maturation during postnatal development. Over the last decade, several landmark studies highlighted the imprinting of tissue-resident macrophages by a combination of ontogenetic and tissue-specific niche factors during development. So far, the ontogeny and the tissue niche factors governing the development and maturation of lining macrophages have not been described. Therefore, the results of this study offer insights into a small, highly adapted macrophage population with relevance in many disease settings in the joint. Furthermore, the findings nicely showcase how macrophages specialize to even very small tissue niches across development within one bigger anatomical compartment to serve dedicated functions within this niche.

      This manuscript is beautifully written and highlights many novel, highly relevant findings on lining macrophage biology and the authors employ a wide range of different technologies to carefully dissect the postnatal development of lining macrophages.

      In particular, the combination of scRNA-seq and fate mapping provides a unique link between transcriptional programs and ontogeny within the tissue niche. Furthermore, the integrative use of distinct fate mapping strategies, transgenic mouse lines, and treatment paradigms to elucidate key niche factors guiding the development and maturation of lining macrophages provides many interesting findings and data that are highly relevant to the field. I really enjoyed reading this manuscript.

      Thank you for your complimentary and constructive assessment of our manuscript, and the detailed comments below, which are very helpful. Please find point-by-point responses below.

      Major points:

      The authors show dynamic regulation of VSIG4 in lining macrophages during development; therefore, VSIG4 is maybe not an ideal choice for gating strategies to define lining macrophages or to show as a single marker in immunofluorescence (IF) stainings to demonstrate their abundance across development (even though it is clear that this is the reason why the F4/80 staining is shown next to it). To demonstrate the increase of lining macrophages during development in IF, it would be more helpful if the authors would show quantifications of all F4/80+ cells and additionally VSIG4+ as a proportion of F4/80+ cells (or VSIG4+ F4/80+ and all F4/80+ in a stacked bar plot).

      We agree with the assessment of VSIG4 not being ideal since this is a key marker of mature lining macrophages only. We will provide additional data and analyses.

      In Figure 1C, the authors nicely demonstrate that the lining macrophages get closer in their distance across development to build the epithelial-like macrophage structure along the adult lining. Is the close proximity between lining macrophages already fully "matured" at 3 weeks of age and comparable to adults? Please quantify the distance in adult linings.

      We will provide additional data for adult joints.

      Can the authors explain how the grouping was performed between the analyzed human fetal joints? It is not clear why the cut was chosen between the groups at 16/17 weeks of age. Maybe it would also be beneficial if the authors would consider not grouping these samples but rather show the specific quantifications for each sample individually and estimate via linear regression the expansion over time across human development. Furthermore, can the authors give additional information about the distancing of lining macrophages in the human fetal samples? It would be great to see if they follow the same dynamics as in mouse. Maybe comparison to human juvenile/adult joints would also help to substantiate the findings in human samples (if possible).

      We will show samples ungrouped and perform new linear regression analysis as suggested.

      The scRNA-seq analysis leaves several questions open and some conclusions and workflows cannot be easily followed.

      We appreciate this comment and the complexity of the data, and will implement the below recommendations, and clarify the issues raised. Detailed:

      a. It is not clear how and especially why the signature genes to define macrophages vs. monocytes were chosen. Especially as the signature genes for monocytes would not include patrolling monocytes and the macrophage signature genes seem to be highly regulated during development, see also Apoe expression in NB vs. adult in Figure S2e. Why did the authors not take classical markers such as Itgam, Fcgr1a, Csf1r?

      We will include new analyses using these markers.

      b. Can dendritic cell signatures be excluded? Cluster 11 and 12 show indeed some DC markers, are these really macrophages?

      We will include new analyses to account for DC markers.

      c. The authors provide several figure panels showing TOP marker genes or key marker genes for the identified clusters, however it is not clear if these are TOP DE genes or if the genes were hand chosen. Somehow, the authors give the impression that the clusters were chosen and labeled not based on DE genes, but more on existing literature that previously reported these macrophage populations. DE gene lists for all annotated cell types and macrophage clusters need to be provided within the manuscript.

      We will provide the full DEG analysis results.

      d. The authors claim that Clusters 1 and 4 are "developing" macrophages. How is this defined? Why are these developing cells compared to other clusters? And why are these clusters later on not considered as progenitors of Aqp1 macrophages and Vsig4 macrophages? Why are Aqp1+ macrophages not labeled as developing when they are later on in the manuscript shown as potential intermediate progenitors of lining macrophages?

      As per below comment, we will expand on this and clarify nomenclature and (potential) relationships between these and other macrophages.

      e. Furthermore, it is again confusing that markers are used throughout Figure 2 which are labeled as "key marker genes" for a population and then later on they are claimed to be regulated during development within this population, see for example Figure 2D and 2H.

      We will clarify this as per above answer.

      f. It is appreciated that the authors distinguished cycling clusters such as 8, 9, and 10 based on their cycling gene signature. Here it would be very exciting to see a cell cycle analysis across all clusters and time points to see when exactly the cells are expanding during development; this would also substantiate the data later shown for the Mki67-CreERT2 mouse model.

      We will perform the proposed cell cycle analysis, and implement this and the other reviewer's suggestions for marker selection and cluster annotation (this is also covered in below comments from other reviewers).

      g. Can the authors identify certain gene modules during development of lining macrophages (and/or their progenitors) which are associated with certain functions (e.g. GO terms, GSEA enrichment)?

      This will be included in the revised manuscript.

      To determine the actual presence of the identified macrophage clusters from the scRNA-seq as macrophage populations in the joint, the authors should perform IF or FACS for key markers. Especially, Aqp1+ macrophages should be shown in the developing joint.

      We will provide additional data on Aqp1+ macrophages in the developing joint, and relate these to a study by collaborators currently in revision at Immunity, which characterizes the Aqp1+ population in detail (we are hoping to have a doi available during our revision process).

      The authors used a wide range of fate mapping models, which is quite unique and highly appreciated. The obtained results and the conclusions made from the models raise a couple of questions: Whereas contribution of HSC-derived/monocyte-derived macrophages to the lining compartment seems to be minor, there is still labeling across different models. Various aspects would need to be clarified.

      We will clarify these data throughout as per below suggestions.

      a. For example, the authors employ Ms4a3-Cre as a tracing model for GMP-derived monocytes, however all quantifications of the labeling efficiency are not normalized to the labeling in monocytes or another highly recombined cell population. This should be shown, similar to the other fate mapping models (Figure 3 F-I).

      Labelling efficacy for Ms4a3-Cre is near complete for GMP-derived monocytes (and neutrophils) with the Rosa-lsl-tdT (aka Ai14) reporter we have used (see also PMID: 31491389 and doi: 10.1101/2024.12.03.626330); but we will include normalized data as requested.

      b. Please show Ms4a3 expression across clusters across time points, to exclude expression in fetal-derived clusters.

      We will include this in the revised supplementary information, but there is indeed very little at birth (in line with the original report for other tissues PMID: 31491389).

      c. In line with the question raised above, if the authors can exclude a development of the Egfr1+ and Clec4n+ developing macrophages into Aqp1+ macrophages and subsequently into Vsig4 lining macrophages, the data obtained from the Ms4a3-Cre model strongly suggest correlative labeling across these clusters, which could imply a relation. However, the authors do not discuss the role of these developing macrophages anywhere in the manuscript. It is highly encouraged to include this in the manuscript, as it would be of high relevance for understanding lining macrophage development.

      This is an interesting point and we agree it deserves consideration in the revised manuscript. Indeed, our trajectory analyses do not predict differentiation of the Egfr1+ and Clec4n+ developing macrophages into Aqp1+ macrophages, and hence, ultimately lining macrophages. Conversely, Aqp1+ cells might also convert into Egfr1+ and Clec4n+ developing macrophages. We will elaborate on this more in the revised manuscript.

      d. The authors conclude from the pseudo bulk transcriptomic profiling of the different macrophage clusters that TdT+ and TdT- macrophages do not differ in their gene expression profile and that this is due to niche imprinting rather than origin imprinting. Even though the data support that conclusion, the authors should verify whether lining cells early during development also show this similar gene expression profile, and gene expression should be compared at the different developmental time points. Tissue niche imprinting happens within the niche during development, most likely in a stepwise process, and therefore there should be differences in the beginning.

      This is another important point that we will address in the revised manuscript by performing additional differential gene expression analyses at the different developmental time points, including the earliest stages, as suggested.

      The trajectorial analysis using different pseudotime pipelines is very interesting and nicely points out the potential role of Aqp1 macrophages as intermediates of Vsig4 lining macrophages. From my point of view, all trajectories seem to suggest that Egfr1 developing macrophages and Clec4n developing macrophages might differentiate into Aqp1 macrophages, however the authors are not exploring this further and the role of both developing macrophage clusters is not further discussed (see also comments above).

      We will address and discuss this in the revised manuscript.

      How was the starting point of the trajectorial analyses defined and is it the same for each pipeline used?

      We will clarify this in the revised manuscript.

      Are there potentially two trajectories? It looks like there is one in the beginning of postnatal life and a second one appearing from the monocyte-compartment later in life. If this is true, that would rather speak for a dual ontogeny of Vsig4+ macrophages, wouldn't it?

      We will discuss this in the revised manuscript.

      A heatmap (transcriptional shift) of trajectories between more clusters should be shown at least for Cluster 0,1,2, and 3. It is not sufficient to demonstrate this only between two clusters.

      We will add these analyses during revision.

      To show the similarity between Aqp1 macrophages and proliferating macrophage clusters, the authors should remove the cycling signature and compare these clusters to show that the cycling cells might be Aqp1 macrophages or earlier developing macrophage progenitors aka Clec4n or Egfr1 macrophages.

      We will address this in the revised manuscript.

      The conclusions made from the Mki67-CreERT2 data are a bit difficult to understand, as all progenitors (monocyte progenitors and macrophage progenitors) will proliferate at the neonatal time point, and no conclusions can be made as to whether the cells expand in the niche. The authors should employ Confetti mice or other models (Ubow mice) to analyze clonal expansion in the niche.

      We acknowledge that interpretation of the Mki67-CreERT2 data is complicated by labeling of other cells, and notably, labeling observed in BM-derived cells. To complement the Mki67-CreERT2 data, and specifically account for proliferation of BM-derived cells, we have tried using Ms4a3-Cre:Ubow mice to quantify expansion of the few monocyte-derived macrophages in the joint (lining). However, this yielded

      All predicted cell-cell interactions between macrophages and fibroblasts should be provided in a supplementary table. Are the interactions shown in Figure 5 chosen interactions or the TOP predicted ones? Whereas the authors show different numbers of interactions, it is most likely hand-picked and therefore biased.

      We will provide a full list of all predicted interactions in the revised supplementary material in addition to a list of the full differential gene expression analysis.

      The authors further aim to dissect the factors involved in the developmental niche imprinting of lining macrophages. Even though it is highly appreciated that the authors used so many experimental setups to show the reliance of lining macrophages on Csf1 and TGF-beta as well as mechanosensation, the wide range of models, the different methods used, and the selected developmental time points make it very difficult to really interpret the data. The authors should carefully choose time points and methods (either FACS analysis across all models or IF across all, or both). Often, deletion efficiencies for transgenic models and proof of concept that the inhibitors and agonists are working in the treatment paradigm are not provided. For example, Csf1rMer-iCre-Mer Tgfbr2fl/fl mice are used, but no deletion efficiency or different time points of analysis are shown; maybe the macrophages are not properly targeted in the setup.

      We have decided to uncouple our experimental data on Tgfb, Piezo1 and mechanosensing/mechanical loading, but are taking this into consideration for revision. In many cases, we have in fact performed flow cytometry and imaging analyses, and agree, we should be showing this consistently.

      The authors have shown the role of Csf1 and Tgfbr2 only for lining macrophages. Is this specific to this population in the joint, or are sublining macrophages affected in a similar manner?

      We will include data on sublining macrophages in the revised figure (for CSF1; Tgfb data will be uncoupled from this current manuscript).

      Can the authors confirm their results in CSF1R-FIRE mice with anti-Csf1 injections or in Csf1op/op mice?

      We will expand our discussion of the Csf1 findings, and aim to include data for anti-CSF1 antibody treatment during revision. Csf1 has previously been reported as a key factor required for maintenance of tissue-resident macrophages, including those in the joint (lining). Indeed, Csf1op/op mice are deficient in synovial lining macrophages from 2 days of age onwards (PMID: 8050349), and lining macrophages are also absent from 2-week-old and adult Csf1r-/- mice (PMID: 11756160). However, a full developmental analysis has not been performed. We are thus the first to show a full developmental time course, using state-of-the-art experimental readouts, and specifically focusing on the early postnatal window of lining maturation that we have identified in this study. Moreover, we have used a more specific model, Csf1rFIRE ko, in which Csf1 deficiency is restricted to myeloid cells. This model circumvents issues with other models, which show many developmental defects, some of which are unrelated to macrophages. These include growth retardation and skeletal defects, which may influence joint macrophage development. Therefore, although Csf1 dependence of synovial lining macrophages had indeed been previously reported in principle, our data substantially expand on and solidify these findings, thereby adding novelty.

      The setup in Figure S5G is very interesting to test the role of movement and mechanical load on the joint, however, there is basically no data on the model provided showing the efficiency of the induced optogenetic muscle contractions, and only one time point is shown.

      Data on mechanical loading will be uncoupled from the current manuscript and substantiated in a separate follow-up.

      The results regarding the role of Piezo1 and mechanosensation vary a lot. Could it be that analyses were done too early or that actually proper weight load on the joint must be applied for the maturation of the macrophages? The authors should test this too.

      We will uncouple these data from the current manuscript during revision in order to investigate the contribution of these (and other) factors in sufficient detail. However, this is a possibility that we have discussed. In fact, the most appropriate experimental approach to address the involvement of mechanical loading, onset of walking and specifically, weight bearing would be a loss-of-function approach (i.e. paralysis at the newborn stage), for which we unfortunately could not obtain ethics approval from the UK Home Office.

      The Rolipram experiment is shown in Figure S5G, but is not described in the result section. It only appears at some point in the discussion part. The authors should move it to results or remove it from the manuscript.

      We will incorporate these data with the revised section on developing synovial macrophage populations.

      Minor points:

      Please reference the Figure panels in numeric order throughout the text.

      We will change this where not the case already.

      Figure 2a and 2b are a bit out of the storyline, it is not obvious why this is shown here and maybe it would be good to move it to the supplements. Gating strategy is also not used for scRNA-seq. Therefore, it would better fit to the later analysis of joint macrophages across different transgenic mouse models and treatment paradigms. The gating strategies are changing across different experiments throughout the figures, it would be nice to have a similar gating strategy for all experiments, see also Figure 3 where the defining markers for joint macrophages are changing between models.

      We will revise Figures 2, 3 and the related supplementary figures.

      A lot of figure panels have very small labeling that is basically unreadable. Axes at FACS plots for example. Sometimes, it is even impossible to distinguish cluster labels especially when they have similar colors.

      We will revise this, thanks for pointing it out.

      In the text on page 14, many markers are named which are specifically regulated during development in lining macrophages, but these factors are not labeled anywhere in the volcano plot. It would be good to showcase at least some of these named genes in the figure panel, e.g. Trem2.

      We will do this for revision.

      Figure 2F and Figure S2F nicely show the percentage of cells per cluster in each analyzed biological sample. Maybe the authors could additionally consider showing a stacked bar plot with the mean percentage of cells per cluster and how the clusters are distributed across time points?

      We will include this in the revised manuscript.

      Figure 3A: IF for adult lining macrophages and the quantification are missing.

      This will be included in the revised version.

      Reviewer #3 - Major

      Generally, the story could be more streamlined by introducing reporter lines and lineage-origin logic earlier. Clearly state which reporter/CreERT2 lines and crosses are used. It was unclear in Figure 2 that cells from the cross of the Cx3cr1-GFP and Ms4a3Cre:Rosa26lsl-tdT reporter lines were used for the scRNA-seq. The principle that there are fetal-derived and bone marrow (GMP)-derived monocytes and macrophages doesn't need to be "hidden" until Figure 3. For example, the imaging of Ms4a3Cre could also be introduced before the scRNA-seq.

      We will revise the structure and order of the manuscript during revision. However, we will streamline this between reviewer comments, and would also like to point out that the 2 other reviewers were very complimentary about the writing and clarity, i.e. we may not follow every specific suggestion of reviewer 3, but are very much taking on board their overall comment on structure and clarity.

      Figure 1 could benefit from a cartoon visualizing the anatomy of the knee joint. The terms "sublining" and "synovium" are now a bit unclear, as it appears that sometimes the synovium is indicated as sublining and vice versa. Additionally, a schematic developmental timeline could be added to indicate the parallels between mouse and human development (fetal and postnatal development in mouse versus gestational age in human). Also, the various waves of hematopoiesis could be indicated in this timeline, which would be particularly helpful for Figure 3 for the lineage-tracing readouts. Lastly, the authors could end the manuscript (a new Figure 6) with a general cartoon summarizing all the results presented.

      We will include these illustrations as suggested.

      Figure 1 could be rearranged: first introduce the markers CX3CR1 and VSIG4 (Figure 1D) and then present the quantifications (Figure 1B/E). Where possible, co-visualization CX3CR1-GFP and VSIG4 on tissue sections to strengthen the claims on the relationship between these 2 markers. Tying the scRNA-seq insights (Figure 2) to the imaging would be elegant. Moreover, it would be informative to represent the CX3CR1+ and VSIG4+ macrophages as a percentage of F4/80+ macrophages (Figure 1B/E). Similarly, for the flow cytometry data in Figure 2, the relationship between the markers CX3CR1 and VSIG4 on macrophages could be more clearly displayed and discussed.

      Thanks for this remark. We will endeavour to show co-localization and analysis of both markers wherever possible. However, where we did not use Cx3cr1gfp mice, co-staining was limited by antibody choice and availability.

      The 3D imaging of the joint is a nice addition to the manuscript, as it provides more context to the anatomical structure; however, while the text suggests several newborn joints were imaged, Figure 1F visualizes (again) the knee joint. Could other joints also be represented by 3D imaging? If the knee joint is the only joint available for imaging, and previous confocal imaging focused specifically on the meniscus in the knee joint, could the meniscus also be highlighted in the lightsheet imaging?

      Apologies if this was not clear from the original manuscript text, but we have only imaged the knee joint in 3D. We will clarify this during revision. Whilst we want to maintain the focus on knee joints throughout this manuscript, we will include additional 3D lightsheet imaging data from micro-dissected knee joints to further substantiate the original data.

      Clarification is requested regarding the imaging quantification representation. The M&M section under "Statistical analysis and reproducibility" states that individual data points are displayed, and bars represent the mean. However, some of the Figure legends (e.g., Figures 1B and S1C) specify that each dot corresponds to an individual mouse, with quantification based on 2-3 sections per mouse. While this appears to be a very reasonable representation of the data, does this mean that for each dot, the mean value from the 2-3 sections per mouse was calculated and plotted?

      We will clarify this.

      It is not clear how the differential expression analysis was performed on the Vsig4+ cells. Please specify whether Cluster 0 or all Vsig4-expressing cells were used for the analysis. Not all cells in Cluster 0 express Vsig4. The authors described the expression dynamics of Aqp1 as intriguing, but do not explain why this is interesting.

      We will revise this section.

      Figure S3E: In line with the previous comment, can the authors justify that the tdTomato+/- comparisons are not biased by scRNA-seq dropout (scRNA-seq is zero-inflated, so some tdTomato- cells could be false negatives), and provide methodological details (thresholds, ambient RNA correction, etc.) to support this?

      We will clarify this and include additional representations of the tdTomato transcript data.

      Although the sex-related differences in macrophage composition and the absence of differential expression are interesting, they distract from the manuscript's main messages. Moreover, the Discussion does not elaborate on how these observations relate to joint (disease) biology. Consider removing this section or integrating it clearly into the relevant biological context.

      We will remove this section as suggested.

      CreERT2 transgenic lines are often not 100% efficient in recombination, also depending on whether tamoxifen or 4-OHT is used. Could the authors report the percentage of tdTomato+ cells in the joints and compare them to the recombination efficiencies in the monocytes/microglia under the same tamoxifen or 4-OHT conditions? This would help clarify how to interpret the macrophage labeling percentages.

      We will report labelling efficiencies and/or show normalized data in the revised manuscript.
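
      As an illustration only, one common way to express such normalized data is to divide the labelling percentage in the population of interest by the percentage in a highly recombined reference population measured under the same conditions. The sketch below assumes this simple ratio; the function name and the example percentages are hypothetical and are not taken from the manuscript.

```python
def normalized_labelling(target_pct: float, reference_pct: float) -> float:
    """Express labelling in a target population relative to a reference population.

    Both inputs are percentages of tdTomato+ cells (0-100). The reference is
    assumed to be a highly recombined population (for example monocytes)
    measured under the same tamoxifen or 4-OHT conditions.
    """
    if not 0 < reference_pct <= 100:
        raise ValueError("reference_pct must be in the interval (0, 100]")
    if not 0 <= target_pct <= 100:
        raise ValueError("target_pct must be in the interval [0, 100]")
    return 100.0 * target_pct / reference_pct


# Hypothetical example: 30% of lining macrophages and 60% of monocytes are tdTomato+.
example = normalized_labelling(30.0, 60.0)
```

      A normalized value close to 100% would indicate labelling comparable to the reference population, whereas a much lower value would argue against a major contribution from the traced lineage.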

      Could the authors draw parallels between the observations in the mouse knee joint macrophage populations and literature on other joints in mouse and the knee joint in human (for example, as described in Alivernini et al., 2020 and in the very recent Raut et al., 2025)?

      We will include a section on this in the revised manuscript.

      Reviewer #3 - Minor comments:

      In general, the authors should clarify in the Results what each marker used for imaging, flow cytometry, or in the mouse reporter lines delineates. For example, mention that F4/80 is a marker for tissue-resident macrophages (correct?) in immunofluorescence, that IBA1 is a marker for macrophages on human tissue sections (Figure S1), and PDPN is GP38 (Figure S2 - align usage of marker reference across main text and figures).

      We will implement this request.

      Figure S1B: Is CX3CR1 also restricted to the lining macrophages in human? Could a co-staining with IBA1 be performed to strengthen the species similarities?

      To our knowledge, there is no antibody available that works for imaging of human CX3CR1. Moreover, CX3CR1 is limited to the lining population only in adult joints; in fetal and newborn (mouse) joints, all macrophages express this receptor, as do fetal macrophage progenitors. However, Alivernini and colleagues have reported that TREM2high macrophages are the human counterpart of the mouse CX3CR1+ lining population (PMID: 32601335). Unfortunately, we do not have access to postnatal human joint tissue samples, but we will attempt to stain for and quantify TREM2+ macrophages in human fetal joints for the revised manuscript.

      Adipocyte diameter quantification: Avoid plotting individual adipocytes from 2 mice without per-mouse visualization. Instead, report the mean adipocyte diameter per mouse and plot those means.

      We will implement this change.
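
      For illustration, the per-mouse aggregation requested here (and the analogous averaging of 2-3 sections per mouse raised above) can be sketched as follows. This is a minimal sketch under stated assumptions: the mouse identifiers and diameter values are hypothetical placeholders, not data from the manuscript.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(measurements):
    """Collapse (mouse_id, value) pairs into a single mean value per mouse."""
    grouped = defaultdict(list)
    for mouse_id, value in measurements:
        grouped[mouse_id].append(value)
    # One data point per animal, so that n reflects mice rather than cells.
    return {mouse_id: mean(values) for mouse_id, values in grouped.items()}


# Hypothetical adipocyte diameters in micrometres from two mice.
measurements = [
    ("mouse_1", 40.0),
    ("mouse_1", 44.0),
    ("mouse_2", 50.0),
    ("mouse_2", 54.0),
    ("mouse_2", 52.0),
]
means = per_mouse_means(measurements)
```

      Plotting one mean per mouse keeps the animal as the unit of replication and avoids treating individual cells or sections from the same animal as independent observations.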

      A little typo was spotted in the "Statistical analysis and reproducibility" section: it is Dunn's, not Bunn's multiple-comparison correction.

      Thanks for spotting this.

      Figure 2A: The gating strategy for the CX3CR1-GFP cells is missing.

      We will provide this in the revised manuscript or supplementary material.

      Improve the visualization of some plots. For example, Figure 2F is hard to read because of the big dot size. The dots seem to add no information to the graph and could be removed. Additionally, for comparing the clusters across the different time points, one could project the cells from the other time points in grey in the background.

      We will revise the presentation of these data.

      Figure S2: The dotplot is more informative than the heatmap, consider removing the heatmap.

      We will do that.

      Figure 3A: If technically feasible, image and visualize both the GFP and tdTomato expression. It would be informative to see the Cx3cr1+ and Ms4a3-derived cells in the same specimen.

      We will strive to show this in the revised manuscript.

      Figure 3C: Highlight that tdTomato expression is visualized here.

      We will do that.

      Figure 3G,F: The authors should place the schematics and graphs next to each other, so the data points can be more easily compared.

      We aim to do this in the revised manuscript.

      Figure 4B: Which co-staining was performed for the immunofluorescence to quantify the % of tdTomato+ cells?

      We co-stained for F4/80 and assessed localization in the lining or sublining. This will be clarified in the revised Figure legend.

      Figure 4C: The trajectory analysis appears to have an arrow pointing from the Ccr2+ macrophages to the Ly6c+ monocytes. Please verify this directionality, as it seems to go against the known biology.

      This will be addressed during revision.

      Figure 5 mentions that the Csf1r levels were reduced in a tissue-specific manner, but it is unclear how this tissue specificity was achieved.

      We apologize for this misunderstanding. Csf1rFIRE mice are not tissue-specific knockouts, but they are more specific than global knockout mice, since only a (myeloid-specific) enhancer is affected. We will clarify this in the relevant section.

      For the TGFb perturbations (Tgfbr2 KO and systemic TGFb depletion): did the authors validate reduced TGFb pathway activity in the macrophages, for example, reduced pSMAD2/3 levels? This would validate the effectiveness of the perturbations.

      This is an important point, and assessing signaling events downstream of TGFb is a very good suggestion. As per the comment above, we have decided to uncouple the functional data, with the exception of CSF1, from the revised version of the current manuscript, but we will take this into account when substantiating our functional data in follow-up work.

      Figure 5F could benefit from a timeline of the treatment.

      As for 15., we will be taking this into account for follow-up work on the uncoupled functional data.

      The Methods mention that Gene Ontology analysis was performed on the single-cell data, but the results are not plotted in a figure. It would be informative to include this GO/pathway analysis in the appropriate figure(s).

      We will include this in the revised (supplementary) information.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Magalhaes Pinto, Malengier-Devlies, and co-authors investigated the developmental origins and maturation of synovial (lining and sublining) macrophages across embryonic, newborn, and postnatal stages in mouse. The authors used multiple transgenic reporter lines, lineage tracing, scRNA-seq, 2D confocal and 3D lightsheet imaging, and perturbations to delineate the macrophage states and ontogeny. They propose a model in which the majority of the joint lining macrophages has a fetal (EMP-derived) origin and a small proportion has a definitive HSC-derived monocyte origin, which both seed and mature within the synovial space in the postnatal period in the first 3 weeks of life. Using cell-cell communication analysis on their scRNA-seq data, they identified Fgf2, Csf1, and Tgfb as candidate signaling pathways that support (lining) macrophage development and maturation. Functional experiments indicate that the process is CSF1 and TGFb-dependent and also partly dependent on mechanosensing through Piezo1. The key conclusions on the composition of the synovial macrophages are convincing based on the presented results, and are carefully phrased. The study is very comprehensive, yet the description and organization of the results of the different mouse models could be altered to improve the storyline. Several refinements in data presentation, formulation, and minor validation experiments would further improve the clarity of the story, as well as summary recaps of the major findings throughout the text.

      Major comments:

      1. Generally, the story could be more streamlined by introducing the reporter lines and lineage-origin logic earlier. Clearly state which reporter/CreERT2 lines and crosses are used. It was unclear in Figure 2 that cells from the cross of the Cx3cr1-GFP and Ms4a3Cre:Rosa26lsl-tdT reporter lines were used for the scRNA-seq. The principle that there are fetal-derived and bone marrow (GMP)-derived monocytes and macrophages doesn't need to be "hidden" until Figure 3. For example, the imaging of Ms4a3Cre could also be introduced before the scRNA-seq.
      2. Figure 1 could benefit from a cartoon visualizing the anatomy of the knee joint. The terms "sublining" and "synovium" are now a bit unclear, as it appears that sometimes the synovium is indicated as sublining and vice versa. Additionally, a schematic developmental timeline could be added to indicate the parallels between mouse and human development (fetal and postnatal development in mouse versus gestational age in human). Also, the various waves of hematopoiesis could be indicated in this timeline, which would be particularly helpful for Figure 3 for the lineage-tracing readouts. Lastly, the authors could end the manuscript (a new Figure 6) with a general cartoon summarizing all the results presented.
      3. Figure 1 could be rearranged: first introduce the markers CX3CR1 and VSIG4 (Figure 1D) and then present the quantifications (Figure 1B/E). Where possible, co-visualize CX3CR1-GFP and VSIG4 on tissue sections to strengthen the claims on the relationship between these 2 markers. Tying the scRNA-seq insights (Figure 2) to the imaging would be elegant. Moreover, it would be informative to represent the CX3CR1+ and VSIG4+ macrophages as a percentage of F4/80+ macrophages (Figure 1B/E). Similarly, for the flow cytometry data in Figure 2, the relationship between the markers CX3CR1 and VSIG4 on macrophages could be more clearly displayed and discussed.
      4. The 3D imaging of the joint is a nice addition to the manuscript, as it provides more context to the anatomical structure; however, while the text suggests several newborn joints were imaged, Figure 1F visualizes (again) the knee joint. Could other joints also be represented by 3D imaging? If the knee joint is the only joint available for imaging, and previous confocal imaging focused specifically on the meniscus in the knee joint, could the meniscus also be highlighted in the lightsheet imaging?
      5. Clarification is requested regarding the imaging quantification representation. The M&M section under "Statistical analysis and reproducibility" states that individual data points are displayed, and bars represent the mean. However, some of the Figure legends (e.g., Figures 1B and S1C) specify that each dot corresponds to an individual mouse, with quantification based on 2-3 sections per mouse. While this appears to be a very reasonable representation of the data, does this mean that for each dot, the mean value from the 2-3 sections per mouse was calculated and plotted?
      6. It is not clear how the differential expression analysis was performed on the Vsig4+ cells. Please specify whether Cluster 0 or all Vsig4-expressing cells were used for the analysis. Not all cells in Cluster 0 express Vsig4. The authors described the expression dynamics of Aqp1 as intriguing, but do not explain why this is interesting.
      7. Figure S3E: In line with the previous comment, can the authors justify that the tdTomato+/- comparisons are not biased by scRNA-seq dropout (scRNA-seq is zero-inflated, so some tdTomato- cells could be false negatives), and provide methodological details (thresholds, ambient RNA correction, etc.) to support this?
      8. Although the sex-related differences in macrophage composition and the absence of differential expression are interesting, they distract from the manuscript's main messages. Moreover, the Discussion does not elaborate on how these observations relate to joint (disease) biology. Consider removing this section or integrating it clearly into the relevant biological context.
      9. CreERT2 transgenic lines are often not 100% efficient in recombination, also depending on whether tamoxifen or 4-OHT is used. Could the authors report the percentage of tdTomato+ cells in the joints and compare them to the recombination efficiencies in the monocytes/microglia under the same tamoxifen or 4-OHT conditions? This would help clarify how to interpret the macrophage labeling percentages.
      10. Could the authors draw parallels between the observations in the mouse knee joint macrophage populations and literature on other joints in mouse and the knee joint in human (for example, as described in Alivernini et al., 2020 and in the very recent Raut et al., 2025)?

      Minor comments:

      1. In general, the authors should clarify in the Results what each marker used for imaging, flow cytometry, or in the mouse reporter lines delineates. For example, mention that F4/80 is a marker for tissue-resident macrophages (correct?) in immunofluorescence, that IBA1 is a marker for macrophages on human tissue sections (Figure S1), and PDPN is GP38 (Figure S2 - align usage of marker reference across main text and figures).
      2. For clarity in the microscopy representation, the single channels should be represented in a grey scale.
      3. Figure S1B: Is CX3CR1 also restricted to the lining macrophages in human? Could a co-staining with IBA1 be performed to strengthen the species similarities?
      4. Adipocyte diameter quantification: Avoid plotting individual adipocytes from 2 mice without per-mouse visualization. Instead, report the mean adipocyte diameter per mouse and plot those means.
      5. A little typo was spotted in the "Statistical analysis and reproducibility" section: it is Dunn's, not Bunn's multiple-comparison correction.
      6. Figure 2A: The gating strategy for the CX3CR1-GFP cells is missing.
      7. Improve the visualization of some plots. For example, Figure 2F is hard to read because of the big dot size. The dots seem to add no information to the graph and could be removed. Additionally, for comparing the clusters across the different time points, one could project the cells from the other time points in grey in the background.
      8. Figure S2: The dotplot is more informative than the heatmap, consider removing the heatmap.
      9. Figure 3A: If technically feasible, image and visualize both the GFP and tdTomato expression. It would be informative to see the Cx3cr1+ and Ms4a3-derived cells in the same specimen.
      10. Figure 3C: Highlight that tdTomato expression is visualized here.
      11. Figure 3G,F: The authors should place the schematics and graphs next to each other, so the data points can be more easily compared.
      12. Figure 4B: Which co-staining was performed for the immunofluorescence to quantify the % of tdTomato+ cells?
      13. Figure 4C: The trajectory analysis appears to have an arrow pointing from the Ccr2+ macrophages to the Ly6c+ monocytes. Please verify this directionality, as it seems to go against the known biology.
      14. Figure 5 mentions that the Csf1r levels were reduced in a tissue-specific manner, but it is unclear how this tissue specificity was achieved.
      15. For the TGFb perturbations (Tgfbr2 KO and systemic TGFb depletion): did the authors validate reduced TGFb pathway activity in the macrophages, for example, reduced pSMAD2/3 levels? This would validate the effectiveness of the perturbations.
      16. Figure 5F could benefit from a timeline of the treatment.
      17. The Methods mention that Gene Ontology analysis was performed on the single-cell data, but the results are not plotted in a figure. It would be informative to include this GO/pathway analysis in the appropriate figure(s).

      Significance

      This work provides a high temporal-resolution and "spatial" resolution reference map of the ontogeny and maturation of the synovial lining macrophages in the knee joint. It complements existing literature that demonstrated the presence of tissue-resident macrophages in the synovial space and lining (Culemann, et al., 2019 and others) by charting the embryonic-to-postnatal emergence of lining and sublining subsets. In particular, this mouse work identified some key signaling pathways in shaping this tissue compartment. This dataset serves as a robust, steady-state reference for joint pathology and can be implemented with human studies on disease biology of the knee joint (e.g., Alivernini et al., 2020; Raut et al., 2025). Insights into the exact developmental origins, mechanisms contributing to diverse or seemingly similar cell types, and distinct maturation processes are crucial to understanding disease biology, in which developmental processes can be hijacked/reactivated.

      These findings will interest researchers in joint disease biology (osteoarthritis and immune-mediated arthritides such as RA and psoriasis), macrophage development (tissue-resident vs monocyte-derived lineages), the bone/joint microenvironment, and joint mechanobiology.

      The reviewer's expertise is in developmental biology, mesoderm, bone biology, hematopoiesis, and monocyte/macrophage biology in disease

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In the manuscript "The synovial lining macrophage layer develops in the first weeks of life in a CSF1- and TGFβ-dependent but monocyte-independent process", Magalhaes Pinto and colleagues carefully employ a wide range of technologies, including single-cell profiling, imaging, and an exceptional combination of fate mapping models, to characterize the ontogeny and development of lining macrophages in the joint, thus dissecting their maturation during postnatal development. Over the last decade, several landmark studies have highlighted the imprinting of tissue-resident macrophages by a combination of ontogenetic and tissue-specific niche factors during development. So far, the ontogeny and the tissue niche factors governing the development and maturation of lining macrophages have not been described. Therefore, the results of this study offer insights into a small, highly adapted macrophage population with relevance in many disease settings in the joint. Furthermore, the findings nicely showcase how macrophages specialize to even very small tissue niches across development within one bigger anatomical compartment to serve dedicated functions within this niche.

      This manuscript is beautifully written and highlights many novel, highly relevant findings on lining macrophage biology and the authors employ a wide range of different technologies to carefully dissect the postnatal development of lining macrophages.

      In particular, the combination of scRNA-seq and fate mapping provides a unique link between transcriptional programs and ontogeny within the tissue niche. Furthermore, the integrative use of distinct fate mapping strategies, transgenic mouse lines, and treatment paradigms to elucidate key niche factors guiding the development and maturation of lining macrophages provides many interesting findings and data that are highly relevant to the field. I really enjoyed reading this manuscript.

      Major points:

      1) The authors show dynamic regulation of VSIG4 in lining macrophages during development; therefore, VSIG4 is perhaps not an ideal choice for gating strategies to define lining macrophages or to show as a single marker in immunofluorescence (IF) stainings to demonstrate their abundance across development (even though it is clear that this is the reason why the F4/80 staining is shown next to it). To demonstrate the increase of lining macrophages during development in IF, it would be more helpful if the authors showed quantifications of all F4/80+ cells and additionally VSIG4+ cells as a proportion of F4/80+ cells (or VSIG4+ F4/80+ and all F4/80+ in a stacked bar plot).

      2) In Figure 1C, the authors nicely demonstrate that the lining macrophages move closer together across development to build the epithelial-like macrophage structure along the adult lining. Is the close proximity between lining macrophages already fully "matured" at 3 weeks of age and comparable to adults? Please quantify the distance in adult linings.

      3) Can the authors explain how the analyzed human fetal joints were grouped? It is not clear why the cut-off between the groups was set at 16/17 weeks of age. It might also be beneficial not to group these samples but rather to show the quantifications for each sample individually and to estimate the expansion over time across human development via linear regression. Furthermore, can the authors give additional information about the distancing of lining macrophages in the human fetal samples? It would be great to see whether they follow the same dynamics as in mouse. A comparison to human juvenile/adult joints might further substantiate the findings in human samples (if possible).

      4) The scRNA-seq analysis leaves several questions open and some conclusions and workflows cannot be easily followed.

      a. It is not clear how and especially why the signature genes to define macrophages vs. monocytes were chosen. Especially as the signature genes for monocytes would not include patrolling monocytes and the macrophage signature genes seem to be highly regulated during development, see also Apoe expression in NB vs. adult in Figure S2e. Why did the authors not take classical markers such as Itgam, Fcgr1a, Csf1r?

      b. Can dendritic cell signatures be excluded? Cluster 11 and 12 show indeed some DC markers, are these really macrophages?

      c. The authors provide several figure panels showing TOP marker genes or key marker genes for the identified clusters, however it is not clear if these are TOP DE genes or if the genes were hand chosen. Somehow, the authors give the impression that the clusters were chosen and labeled not based on DE genes, but more on existing literature that previously reported these macrophage populations. DE gene lists for all annotated cell types and macrophage clusters need to be provided within the manuscript.

      d. The authors claim that Clusters 1 and 4 are "developing" macrophages. How is this defined? Why are these developing cells compared to other clusters? And why are these clusters later on not considered as progenitors of Aqp1 macrophages and Vsig4 macrophages? Why are Aqp1+ macrophages not labeled as developing when they are later on in the manuscript shown as potential intermediate progenitors of lining macrophages?

      e. Furthermore, it is again confusing that markers are used throughout Figure 2 which are labeled as "key marker genes" for a population and then later on they are claimed to be regulated during development within this population, see for example Figure 2D and 2H.

      f. It is appreciated that the authors distinguished cycling clusters such as 8, 9, and 10 based on their cycling gene signature. Here it would be very exciting to see a cell cycle analysis across all clusters and time points to see when exactly the cells are expanding during development; this would also substantiate the data later shown for the Mki67-CreERT2 mouse model.

      g. Can the authors identify certain gene modules during development of lining macrophages (and/or their progenitors) which are associated with certain functions (e.g. GO terms, GSEA enrichment)?

      5) To determine the actual presence of the identified macrophage clusters from the scRNA-seq as macrophage populations in the joint, the authors should perform IF or FACS for key markers. Especially, Aqp1+ macrophages should be shown in the developing joint.

      6) The authors used a wide range of fate mapping models, which is quite unique and highly appreciated. The obtained results and the conclusions made from the models raise a couple of questions: Whereas contribution of HSC-derived/monocyte-derived macrophages to the lining compartment seems to be minor, there is still labeling across different models. Various aspects would need to be clarified.

      a. For example, the authors employ Ms4a3-Cre as a tracing model for GMP-derived monocytes; however, none of the quantifications of the labeling efficiency are normalized to the labeling in monocytes or another highly recombined cell population. This should be shown, similar to the other fate mapping models (Figure 3 F-I).

      b. Please show Ms4a3 expression across clusters across time points, to exclude expression in fetal-derived clusters.

      c. In line with the question raised above, if the authors can exclude a development of the Egfr1+ and Clec4n+ developing macrophages into Aqp1+ macrophages and subsequently into Vsig4 lining macrophages, the data obtained from the Ms4a3-Cre model strongly suggest a correlative labeling across these clusters, which could imply a relationship. However, the authors do not discuss the role of these developing macrophages anywhere in the manuscript. The authors are strongly encouraged to include this, as it would be of high relevance for understanding lining macrophage development.

      d. The authors conclude from the pseudobulk transcriptomic profiling of the different macrophage clusters that TdT+ and TdT- macrophages do not differ in their gene expression profile and that this is due to niche imprinting rather than origin imprinting. Even though the data support that conclusion, the authors should verify whether lining cells early during development also show this similar gene expression profile, and gene expression should be compared at the different developmental time points. Tissue niche imprinting happens within the niche during development, most likely in a stepwise process, and therefore there should be differences in the beginning.

      7) The trajectory analysis using different pseudotime pipelines is very interesting and nicely points out the potential role of Aqp1 macrophages as intermediates of Vsig4 lining macrophages. From my point of view, all trajectories seem to suggest that Egfr1 developing macrophages and Clec4n developing macrophages might differentiate into Aqp1 macrophages; however, the authors do not explore this further, and the role of both developing macrophage clusters is not further discussed (see also comments above).

      8) How was the starting point of the trajectory analyses defined, and is it the same for each pipeline used?

      9) Are there potentially two trajectories? It looks like there is one in the beginning of postnatal life and a second one appearing from the monocyte-compartment later in life. If this is true, that would rather speak for a dual ontogeny of Vsig4+ macrophages, wouldn't it?

      10) A heatmap (transcriptional shift) of trajectories between more clusters should be shown at least for Cluster 0,1,2, and 3. It is not sufficient to demonstrate this only between two clusters.

      11) To show the similarity between Aqp1 macrophages and proliferating macrophage clusters, the authors should remove the cycling signature and compare these clusters to show that the cycling cells might be Aqp1 macrophages or earlier developing macrophage progenitors aka Clec4n or Egfr1 macrophages.

      12) The conclusions drawn from the Mki67-CreERT2 data are a bit difficult to understand, as all progenitors (monocyte progenitors and macrophage progenitors) will proliferate at the neonatal time point, and no conclusions can be drawn as to whether the cells expand in the niche. The authors should employ Confetti mice or other models (Ubow mice) to analyze clonal expansion in the niche.

      13) All predicted cell-cell interactions between macrophages and fibroblasts should be provided in a supplementary table. Are the interactions shown in Figure 5 selected interactions or the TOP predicted ones? As the authors show different numbers of interactions, they are most likely hand-picked and therefore biased.

      14) The authors further aim to dissect the factors involved in the developmental niche imprinting of lining macrophages. Even though it is highly appreciated that the authors used so many experimental setups to show the reliance of lining macrophages on Csf1 and TGF-beta as well as mechanosensation, the wide range of models, the different methods used, and the selected developmental time points make it very difficult to really interpret the data. The authors should carefully choose time points and methods (either FACS analysis across all models or IF across all, or both). Often, deletion efficiencies for transgenic models and proof of concept that the inhibitors and agonists are working in the treatment paradigm are not provided. For example, Csf1rMer-iCre-Mer Tgfbr2fl/fl mice are used, but neither deletion efficiency nor different time points of analysis are shown; maybe the macrophages are not properly targeted in this setup.

      15) The authors have shown the role of Csf1 and Tgfbr2 only for lining macrophages. Is this specific to this population in the joint, or are sublining macrophages affected in a similar manner?

      16) Can the authors confirm their results in CSF1R-FIRE mice with anti-Csf1 injections or in Csf1op/op mice?

      17) The setup in Figure S5G is very interesting to test the role of movement and mechanical load on the joint, however, there is basically no data on the model provided showing the efficiency of the induced optogenetic muscle contractions, and only one time point is shown.

      18) The results regarding the role of Piezo1 and mechanosensation vary a lot. Could it be that analyses were done too early or that actually proper weight load on the joint must be applied for the maturation of the macrophages? The authors should test this to

      19) The Rolipram experiment is shown in Figure S5G, but is not described in the Results section. It only appears at some point in the Discussion. The authors should move it to the Results or remove it from the manuscript.

      Minor points:

      1) Please reference the Figure panels in numeric order throughout the text.

      2) Figures 2a and 2b are somewhat outside the storyline. It is not obvious why they are shown here, and it might be better to move them to the supplement. The gating strategy is also not used for scRNA-seq, so it would fit better with the later analysis of joint macrophages across different transgenic mouse models and treatment paradigms. The gating strategies change across experiments throughout the figures; it would be preferable to use a similar gating strategy for all experiments. See also Figure 3, where the defining markers for joint macrophages change between models.

      3) Many figure panels have very small labeling that is basically unreadable, for example the axes of the FACS plots. Sometimes it is even impossible to distinguish cluster labels, especially when they have similar colors.

      4) In the text on page 14, many markers are named which are specifically regulated during development in lining macrophages, but these factors are not labeled anywhere in the volcano plot. It would be good to showcase at least some of these named genes in the figure panel, e.g. Trem2.

      5) Figure 2F and Figure S2F nicely show the percentage of cells per cluster in each analyzed biological sample. Could the authors additionally consider showing a stacked bar plot with the mean percentage of cells per cluster and how the clusters are distributed across time points?

      6) Figure 3A: IF for adult lining macrophages and the quantification are missing.

      Significance

      This manuscript highlights novel, highly relevant findings on lining macrophage biology and the authors employ a wide range of different technologies to carefully dissect the postnatal development of lining macrophages. Furthermore, this study showcases in a very elegant and detailed way the adaptation of macrophage progenitors to a highly specific anatomical tissue niche.

      The manuscript is of high interest to basic scientists focussing on macrophage biology and immune cell development, and to clinicians and clinician scientists focussing on joint diseases such as RA.

      Therefore, the manuscript is of interest to a wide community working in immunology.

    1. Solicitation - Unify OpenScape Maintenance (FA469025Q0062), HigherGov opportunity page

      Description: The 28th Contracting Squadron, Ellsworth AFB, South Dakota, has issued Solicitation FA469025Q0062 to compete and award a Firm-Fixed Price Contract for sustainment and repair of the Mitel/Unify OpenScape Systems located at Ellsworth Air Force Base, South Dakota. This requirement is being solicited Full & Open (No Small Business Set-Aside), limited to Unify OpenScape Brand Name items and certified technicians. Attached to the Solicitation Notice are the Standard Form 1449 Solicitation, Attachment 1 (Performance Work Statement, PWS) and Attachment 2 (Wage Determination).

      Overview:

      • Agency: Air Combat Command (ACC) [DoD - USAF]

      • Posted: Sept. 4, 2025, 12:14 p.m. EDT

      • Response deadline: Sept. 17, 2025, 5:00 p.m. EDT (3:00pm MDT)

      • Set aside: None

      • NAICS: 541519 - Other Computer Related Services

      • PSC: DG10 - IT And Telecom - Network As A Service

      • Place of performance: Ellsworth AFB, SD 57706, United States

      • Pricing: Fixed Price

      • Estimated value range (experimental AI estimate): $50,000 - $150,000

      • Primary contact: Marc L Bellucci, marc.bellucci.1@us.af.mil, (605) 385-1782

      • Secondary contact: Joshua Johnson, joshua.johnson.233@us.af.mil, (605) 385-1734

      Questions and submissions: All questions and comments on this solicitation must be submitted in writing to joshua.johnson.233@us.af.mil no later than Wednesday, 10 September 2025 by 12:00pm MDT. Vendor quotes and all items required as listed within the Addendum to 52.212-1, Instruction to Offerors, are due no later than Wednesday, 17 September 2025 by 3:00pm MDT to joshua.johnson.233@us.af.mil, with the subject line "FA469025Q0062 - Unify Maintenance". Contractors submitting a quote must have and list within the quote their assigned Cage Code and be registered and ACTIVE in the System of Award Management (SAM) at www.sam.gov to be eligible for award. The award will be based on the criteria established in the solicitation.

      Work details: The contractor shall provide all personnel, equipment, tools, materials, supervision, and any other items and services necessary to ensure that the Unify system is operational. Key tasks include:

      • Performing maintenance and repair on Mitel/Unify telecommunications hardware and software installed throughout Ellsworth AFB.

      • Diagnosing and resolving system issues for applications such as OpenScape Voice, OpenScape Xpert, iNemsoft radio interface, and ASC voice.

      • Ensuring that all technical services are performed by individuals who are certified, qualified, and experienced in Mitel/Unify systems.

      • Sustainment and repair of the specific equipment listed in Section 6.0 (Equipment List) of the PWS.

      Period of performance: A base period of 12 months with four additional option years, each lasting 12 months.

      Related notice: A Sources Sought / RFI for Unify OpenScape Maintenance was posted on 04/30/25 with a deadline of 05/09/25. It was issued solely for market research purposes in accordance with Federal Acquisition Regulation (FAR) Part 10 and is not a solicitation for proposals.

      Assistant Q&A: "Do they need certified people or something similar?" Yes, the solicitation explicitly requires that all primary technical services be performed by individuals who are Mitel/Unify certified, qualified, and experienced in performing maintenance and repair of the specified equipment. Technicians must have a complete understanding of system design, installation diagrams, and maintenance records for the equipment listed in the PWS, and must meet the Air Force requirements for access: Air Force Cyber Readiness Training (Annual) and CompTIA Security+ Certification. These requirements are detailed in Section 4.0 of the [Performance Work Statement (PWS)](https://www.highergov.com/document/attachment-1-ellsworth-28th-bw-unify-sustainment-pws-pdf-4c2c3b/).

      Specifically, the contractor must:

      • Be a Mitel/Unify Systems certified partner authorized to perform on-network installations and maintenance.
    1. Reviewer #1 (Public review):

      Summary:

      This work provides important new evidence of the cognitive and neural mechanisms that give rise to feelings of shame and guilt, as well as their transformation into compensatory behavior. The authors use a well-designed interpersonal task to manipulate responsibility and harm, eliciting varying levels of shame and guilt in participants. The study combines behavioral, computational, and neuroimaging approaches to offer a comprehensive account of how these emotions are experienced and acted upon. Notably, the findings reveal distinct patterns in how harm and responsibility contribute to guilt and shame and how these factors are integrated into compensatory decision-making.

      Strengths:

      (1) Investigating both guilt and shame in a single experimental framework allows for a direct comparison of their behavioral and neural effects while minimizing confounds.

      (2) The study provides a novel contribution to the literature by exploring the neural bases underlying the conversion of shame into behavior.

      (3) The task is creative and ecologically valid, simulating a realistic social situation while retaining experimental control.

      (4) Computational modeling and fMRI analysis yield converging evidence for a quotient-based integration of harm and responsibility in guiding compensatory behavior.

      Weaknesses:

      (1) Post-experimental self-reports rely both on memory and on the understanding of the conceptual difference between the two emotions. Additionally, it is unclear whether the 16 scenarios were presented in random order; sequential presentation could have introduced contrast effects or demand characteristics.

      (2) In the neural analysis of emotion sensitivity, the authors identify brain regions correlated with responsibility-driven shame sensitivity and then use those brain regions as masks to test whether they were more involved in the responsibility-driven shame sensitivity than the other types of emotion sensitivity. I wonder if this is biasing the results. Would it be better to use a cross-validation approach? A similar issue might arise in "Activation analysis (neural basis of compensatory sensitivity)."

      Additional comments and questions:

      (1) Regarding the traits of guilt and shame, I appreciate using the scores from the subscales (evaluations and action tendencies) separately for the analyses (instead of a composite score). An issue with using the actions subscales when measuring guilt and shame proneness is that the behavioral tendencies for each emotion get conflated with their definitions, risking circularity. It is reassuring that the behavior evaluation subscale was significantly correlated with compensatory behavior (not only the action tendencies subscale). However, the absence of significant neural correlates for the behavior evaluation subscale raises questions: Do the authors have thoughts on why this might be the case, and any implications?

      (2) Regarding the computational model finding that participants seem to disregard self-interest, do the authors believe it may reflect the relatively small endowment at stake? Do the authors believe this behavior would persist if the stakes were higher? Additionally, might the type of harm inflicted (e.g., electric shock vs. less stigmatized/less ethically charged harm like placing a hand in ice-cold water) influence the weight of self-interest in decision-making?

      Taken together, the conclusions of the paper are well supported by the data. It would be valuable for future studies to validate these findings using alternative tasks or paradigms to ensure the robustness and generalizability of the observed behavioral and neural mechanisms.

    2. Reviewer #3 (Public review):

      Summary:

      Zhu et al. set out to elucidate how the moral emotions of guilt and shame emerge from specific cognitive antecedents - harm and responsibility - and how these emotions subsequently drive compensatory behavior. Consistent with their prediction derived from functionalist theories of emotion, their behavioral findings indicate that guilt is more influenced by harm, whereas shame is more influenced by responsibility. In line with previous research, their results also demonstrate that guilt has a stronger facilitating effect on compensatory behavior than shame. Furthermore, computational modeling and neuroimaging results suggest that individuals integrate harm and responsibility information into a composite representation of the individual's share of the harm caused. Brain areas such as the striatum, insula, temporoparietal junction, lateral prefrontal cortex, and cingulate cortex were implicated in distinct stages of the processing of guilt and/or shame. In general, this work makes an important contribution to the field of moral emotions. Its impact could be further enhanced by clarifying methodological details, offering a more nuanced interpretation of the findings, and discussing their potential practical implications in greater depth.

      Strengths:

      First, this work conceptualizes guilt and shame as processes unfolding across distinct stages (cognitive appraisal, emotional experience, and behavioral response) and investigates the psychological and neural characteristics associated with their transitions from one stage to the next.

      Second, the well-designed experiment effectively manipulates harm and responsibility - two critical antecedents of guilt and shame.

      Third, the findings deepen our understanding of the mechanisms underlying guilt and shame beyond what has been established in previous research.

      Weaknesses:

      (1) Over the course of the task, participants may gradually become aware of their high error rate in the dot estimation task. This could lead them to discount their own judgments and become inclined to rely on the choices of other deciders. It is unclear whether participants in the experiment had the opportunity to observe or inquire about others' choices. This point is important, as the compensatory decision-making process may differ depending on whether choices are made independently or influenced by external input.

      (2) Given the inherent complexity of human decision-making, it is crucial to acknowledge that, although the authors compared eight candidate models, other plausible alternatives may exist. As such, caution is warranted when interpreting the computational modeling results.

      (3) I do not agree with the authors' claim that "computational modeling results indicated that individuals integrate harm and responsibility in the form of a quotient" (i.e., harm/responsibility). Rather, the findings appear to suggest that individuals may form a composite representation of the harm attributable to each individual (i.e., harm/the number of people involved). The explanation of the modeling results ought to be precise.

      (4) Many studies have reported positive associations between trait gratitude, social value orientation, and altruistic behavior. It would be helpful if the authors could provide an explanation about why this study failed to replicate these associations.

      (5) As the authors noted, guilt and shame are closely linked to various psychiatric disorders. It would be valuable to discuss whether this study has any implications for understanding or even informing the treatment of these disorders.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-02879 Corresponding author(s): Matteo Allegretti; Alia dos Santos

      1. General Statements

      In this study, we investigated the effects of paclitaxel on both healthy and cancerous cells, focusing on alterations in nuclear architecture. Our novel findings show that:

      • Paclitaxel-induced microtubule reorganisation during interphase alters the perinuclear distribution of actin and vimentin. The formation of extensive microtubule bundles, in paclitaxel or following GFP-Tau overexpression, coincides with nuclear shape deformation, loss of regulation of nuclear envelope spacing, and alteration of the nuclear lamina.

      • Paclitaxel treatment reduces Lamin A/C protein levels via a SUN2-dependent mechanism. SUN2, which links the lamina to the cytoskeleton, undergoes ubiquitination and consequent degradation following paclitaxel exposure.

      • Lamin A/C expression, frequently dysregulated in cancer cells, is a key determinant of cellular sensitivity to, and recovery from, paclitaxel treatment.

      Collectively, our data support a model in which paclitaxel disrupts nuclear architecture through two mechanisms: (i) aberrant nuclear-cytoskeletal coupling during interphase, and (ii) multimicronucleation following defective mitotic exit. This represents an additional mode of action for paclitaxel beyond its well-established mechanism of mitotic arrest.

      We thank the reviewers for their time and constructive feedback. We have carefully considered all comments and have carried out a full revision. The updated manuscript now includes additional data showing:

      • Overexpression of microtubule-associated protein Tau causes similar nuclear aberration phenotypes to paclitaxel. This supports our hypothesis that increased microtubule bundling directly leads to nuclear disruption in paclitaxel during interphase.

      • Paclitaxel's effects on nuclear shape and Lamin A/C and SUN2 expression levels occur independently of cell division.

      • Reduced levels of Lamin A/C and SUN2 upon paclitaxel treatment occur at the protein level via ubiquitination of SUN2.

      • The effects of paclitaxel on the nucleus are conserved in breast cancer cells.

      Full Revision

      We have also edited our text and added further detail to clarify points raised by the reviewers. We believe that our revised manuscript is overall more complete, solid and compelling thanks to the reviewers' comments.

      2. Point-by-point description of the revisions

      Reviewer #1 Evidence, reproducibility and clarity

      This description of the down-regulation of the expression of lamin A/C upon treatment with paclitaxel and its sensitivity to SUN2 is quite interesting but still somehow preliminary. It is unclear whether this effect involves the regulation of gene expression, or of the stability of the proteins. How SUN2 mediates this effect is still unknown.

      We thank the reviewer for this valuable comment. To elucidate the mechanism behind the decrease in Lamin A/C and SUN2 levels, we have now performed several additional experiments. First, we performed RT-qPCR to quantify mRNA levels of these genes, relative to the housekeeping gene GAPDH (Supplementary Figure 3B and O). The levels of SUN2 and LMNA mRNA remained the same between control and paclitaxel-treated cells, indicating that this effect instead occurs at the protein level. We have also tested post-translational modifications as a potential regulatory mechanism for Lamin A/C and SUN2. In addition to the phosphorylation of Ser404, which we had already tested (Supplementary Figure 3C), we have now included additional Phos-tag gel and Western blotting data showing that the overall phosphorylation status of Lamin A/C is not affected by paclitaxel (Supplementary Figure 3E and F). We also pulled down Lamin A/C from cell lysates and then Western blotted for polyubiquitin and acetyl-lysine, which showed that the ubiquitination and acetylation states of Lamin A/C are also not affected by paclitaxel (Supplementary Figure 3G-I). However, Western blots for polyubiquitin of SUN2 pulled down from cell lysates showed that paclitaxel treatment results in significant SUN2 ubiquitination (Figure 3M and N). Therefore, we propose that the downregulation of SUN2 following paclitaxel treatment occurs by ubiquitin-mediated proteolysis.
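
      For readers less familiar with relative mRNA quantification, the comparison to a housekeeping gene can be sketched as follows. This is a minimal illustration only: the response does not state which quantification method was used, so the 2^-ΔΔCt approach, the function name, and all Ct values below are assumptions, and primer efficiency is assumed to be close to 100%.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs control) by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both primer pairs.
    The reference gene plays the role of the housekeeping gene (e.g. GAPDH).
    """
    # Normalise each sample to its own reference gene.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, not taken from the study.
fold_change = relative_expression(24.1, 18.0, 24.0, 18.0)
print(f"fold change vs control: {fold_change:.2f}")
```

      Under these assumptions, a fold change close to 1 corresponds to unchanged mRNA levels, which is the pattern described above for SUN2 and LMNA.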

      The roles of free tubulins and polymerized microtubules, and thus the potential role of paclitaxel, need to be uncovered.

      We addressed this important point by using an alternative method to stabilise/bundle microtubules in interphase, namely by overexpressing GFP-Tau, as suggested by reviewer 2. Following GFP-Tau overexpression, large microtubule bundles were observed throughout the cytoplasm (Figure 4A), and this resulted in a significant decrease in nuclear solidity (Figure 4B). Furthermore, in cells where microtubule bundles extensively contacted the nucleus, the nuclear lamina became unevenly distributed and appeared patchy (Figure 4C). This supports our hypothesis that the aberrations to nuclear shape and Lamin A/C localisation in paclitaxel-treated cells are due to the presence of microtubule bundles surrounding the nucleus.
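
      Nuclear solidity is conventionally defined as the area of a shape divided by the area of its convex hull, so a smooth convex nucleus scores close to 1 and an indented or lobed nucleus scores lower. The sketch below illustrates that definition on a 2D outline. It is not the analysis pipeline used in the study, which is not described in this response; the function names and example outlines are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon from ordered (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def solidity(outline):
    """Shape area divided by convex hull area (1.0 for a convex outline)."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# Hypothetical outlines: a convex "nucleus" and one with a deep invagination.
convex_outline = [(0, 0), (4, 0), (4, 4), (0, 4)]
indented_outline = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
print(solidity(convex_outline), solidity(indented_outline))
```

      In practice, segmented images are usually measured with an image-analysis library rather than hand-written geometry, but the ratio being reported is the same.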

      The doses of paclitaxel at which the effects described in the paper occur are not fully consistent with all the conclusions. Most experiments have been done at 5 nM. However, at this dose the effects of lamin A/C over or down expression on the growth (differences in the slopes of the curves in Figure 4A) are not fully convincing and not fully consistent with the clear effect on viability as well (in addition, the durations of treatments before assessing viability are not specified). At 1 nM, cell growth is reduced and the rescuing effect of lamin over-expression is much more clear (Fig 4A), and the nucleus deformation is clear (Fig 2A), but this dose has no effect on lamin A/C expression (Fig 3C), which questions how lamins impact nucleus shape and cell survival. Cytoskeleton reorganisation in these conditions is not described although it could clarify the respective role of force production (suggested in figure 1) and nuclei resistance (shown in figure 2) in paclitaxel sensitivity.

      We thank the reviewer for raising this important point. We have addressed this by conducting additional repeats for the cell confluency measurements to increase the statistical power of our experiments (Figure 5A). Our data now show that GFP-Lamin A/C had a statistically significant effect on rescuing cell growth at both 1 nM and 5 nM paclitaxel, while Lamin A/C knockdown exacerbated the inhibition of cell growth at 5 nM paclitaxel but not 1 nM paclitaxel (Figure 5A). In addition, we note that the duration of paclitaxel treatment before assessing viability was specified in the figure legend: "Bar graph comparing cell viability between wild-type (red), GFP-Lamin A/C overexpression (green), and Lamin A/C knockdown (blue) cells following 20 h incubation in 0, 1, 5, or 10 nM paclitaxel." We also repeated cell viability analysis after 48 h incubation in paclitaxel instead of 20 h to allow for a longer time for differences to take effect (Figure 5B).

      We also added figures showing the cytoskeletal reorganisation at both 1 and 10 nM in addition to 0 and 5 nM (Supplementary Figure 1A) showing that microtubule bundling and condensation of actin into puncta correlated with increased paclitaxel concentration. Vimentin colocalised well with microtubules at all concentrations.

      We have also included in our results section further clarification for the use of 5 nM paclitaxel in this study. The new section reads as follows: "Experiments were performed at 5 nM paclitaxel (with additional experiments to determine dose relationships at 1 and 10 nM) because this aligns with previous studies7,14,24. Furthermore, previous analysis of patient plasma reveals that typical concentrations are within the low nanomolar range8, and concentrations of 5-10 nM are required in cell culture to reach the same intracellular concentrations observed in vivo in patient tumours9. This aligns with in vitro cytotoxic studies of paclitaxel in eight human tumour cell lines which show that paclitaxel's IC50 ranges between 2.5 and 7.5 nM41."

      Finally, although the absence of role of mitotic arrest is clear from the data, the defective reorganisation of the nucleus after mitosis still suggest that the effect of paclitaxel is not independent of mitosis.

      We thank the reviewer for pointing out the need for clarification in the wording of our manuscript. We have reworded the title and relevant sections of our abstract, introduction, and discussion to make it clearer that the effects of paclitaxel on the nucleus are due to a combination of aberrant nuclear-cytoskeletal coupling during interphase and multimicronucleation following mitotic slippage. We have also added additional data in support of the effect of paclitaxel on nuclear architecture during interphase. For this, we used serum-starved cells (which divide only very slowly such that the majority of cells do not pass through mitosis during the 16 h incubation in paclitaxel [Supplementary Figure 2D]). Our new data confirmed that paclitaxel's effects on nuclear solidity, and Lamin A/C and SUN2 protein levels can occur independently of cell division (Figure 2C; Figure 3H-J). Finally, when we overexpressed GFP-Tau (as discussed above) we observed similar aberrations to nuclear solidity and Lamin A/C localisation. This indicates that these effects occur due to microtubule bundling in interphase, especially as in our study GFP-Tau did not lead to multimicronucleation or appear to affect mitosis (Figure 4).

      Below are the main changes to the text regarding the interphase effect of paclitaxel:

      • Title: "Paclitaxel compromises nuclear integrity in interphase through SUN2-mediated cytoskeletal coupling"

      • Abstract: "Overall, our data supports nuclear architecture disruption, caused by both aberrant nuclear-cytoskeletal coupling during interphase and exit from defective mitosis, as an additional mechanism for paclitaxel beyond mitotic arrest."

      • Introduction: "Here we propose that cancer cells have increased vulnerability to paclitaxel both during interphase and following aberrant mitosis due to pre-existing defects in their NE and nuclear lamina."

      • Discussion: "Overall, our work builds on previous studies investigating loss of nuclear integrity as an anti-cancer mechanism of paclitaxel separate from mitotic arrest14,20,21. We propose that cancer cells show increased sensitivity to nuclear deformation induced by aberrant nuclear-cytoskeletal coupling and multimicronucleation following mitotic slippage. Therefore, we conclude that paclitaxel functions in interphase as well as mitosis, elucidating how slowly growing tumours are targeted."

      minor: a more thorough introduction of known data about dose response of cells in culture and in vivo would help understanding the range of concentrations used in this study.

      As mentioned above, we have now included additional information in our Results section to clarify our paclitaxel dose range: "Experiments were performed at 5 nM paclitaxel (with additional experiments to determine dose relationships at 1 and 10 nM) because this aligns with previous studies7,14,24. Furthermore, previous analysis of patient plasma reveals that typical concentrations are within the low nanomolar range8, and concentrations of 5-10 nM are required in cell culture to reach the same intracellular concentrations observed in vivo in patient tumours9. This aligns with in vitro cytotoxic studies of paclitaxel in eight human tumour cell lines which show that paclitaxel's IC50 ranges between 2.5 and 7.5 nM41."

      Significance

      In this manuscript, Hale and colleagues describe the effect of paclitaxel on nucleus deformation and cell survival. They showed that 5nM of paclitaxel induces nucleus fragmentation, cytoskeleton reorganisation, reduced expression of LaminA/C and SUN2, and reduced cell growth and viability. They also showed that these effects could be at least partly compensated by the over-expression of lamin A/C. As fairly acknowledged by the authors, the induction of nuclear deformation in paclitaxel-treated cells, and the increased sensitivity to paclitaxel of cells expressing low level of lamin A/C are not novel (reference #14). Here the authors provided more details on the cytoskeleton changes and nuclear membrane deformation upon paclitaxel treatment. The effect of lamin A/C over and down expression on cell growth and survival are not fully convincing, as further discussed below. The most novel part is the observation that paclitaxel can induce the down-regulation of the expression of lamin A/C and that this effect is mediated by SUN2.

      We appreciate the reviewer's summary and thank them for their time. We believe our comprehensive revisions have addressed all comments, strengthening the manuscript and making it more robust and compelling.

      Reviewer #2 Evidence, reproducibility and clarity

      This study investigates the effects of the chemotherapeutic drug paclitaxel on nuclear-cytoskeletal coupling during interphase, claiming a novel mechanism for its anti-cancer activity. The study uses hTERT-immortalized human fibroblasts. After paclitaxel exposure, a suite of state-of-the-art imaging modalities visualizes changes in the cytoskeleton and nuclear architecture. These include STORM imaging and a large number of FIB-SEM tomograms.

      We thank the reviewer for the summary and for highlighting our efforts in using the latest imaging technical advances.

      Major comments:

      The authors make a major claim that in addition to the somewhat well-described mechanism of paclitaxel on mitosis, they have discovered 'an alternative, poorly characterised mechanism in interphase'.

      However, none of the data proves that the effects shown are independent of mitosis. To the contrary, measurements are presented 48 hours after paclitaxel treatment starts, after which it can be assumed that 100% of cells have completed at least one mitotic event. The appearance of micronuclei evidences this, as discussed by the authors shortly. It looks like most of the results shown are based on botched mitosis or, more specifically, errors on nuclear assembly upon exit from mitosis rather than a specific effect of paclitaxel on interphase. The readouts the authors show just happen to be measurements while the cells are in interphase.

      Alternative hypotheses are missing throughout the manuscript, and so are critical controls and interpretations.

      We thank the reviewer for highlighting the lack of clarity in our wording. We have revised the title, abstract and relevant sections of the introduction and discussion to clarify our message that the effects of paclitaxel on the nucleus arise from a combination of aberrant nuclear-cytoskeletal coupling during interphase and multimicronucleation following exit from defective mitosis. We have also included additional data where we used slow-dividing, serum-starved cells (under these conditions, the majority of cells do not undergo mitosis during the 16 h incubation in paclitaxel [Supplementary Figure 2D]). Our new data show that even in these cells there is a clear effect of paclitaxel on nuclear solidity, and Lamin A/C and SUN2 protein levels, further supporting our hypothesis that these phenotypes can occur independently of cell division (Figure 2C; Figure 3H-J). Furthermore, we performed additional experiments where we used overexpression of GFP-Tau as an alternative method of stabilising microtubules in interphase and observed similar aberrations to nuclear solidity and Lamin A/C localisation. As GFP-Tau overexpression did not lead to micronucleation or appear to affect mitosis, these data support the hypothesis that nuclear aberrations occur due to microtubule bundling in interphase (Figure 4). We discuss these experiments in more detail below. Finally, we have reworded the introduction to better introduce alternative hypotheses and mechanisms for paclitaxel's activity.

      The authors claim that 'Previously, the anti-cancer activity of paclitaxel was thought to rely mostly on the activation of the mitotic checkpoint through disruption of microtubule dynamics, ultimately resulting in apoptosis.' The authors may have overlooked much of the existing literature on the topic, including many recent manuscripts from Xiang-Xi Xu's and another lab.

      We would like to note that the paper from Xiang-Xi Xu's lab (Smith et al, 2021) was cited in our original manuscript (reference 14 in both the original and revised manuscripts). We have now also included additional review articles from the Xiang-Xi Xu lab (PMID: 36368286, reference 20; PMID: 35048083, reference 21). Furthermore, we have clarified the wording in both the introduction and discussion to better reflect the current understanding of paclitaxel's mechanism and alternative hypotheses.

      The data, e.g. in Figure 1, does not hold up to the first alternative hypothesis, e.g. that paclitaxel stabilizes microtubules and that excessive mechanical bundling of microtubules induces major changes to cell shape and mechanical stress on the nucleus. Even the simplest controls for this effect are missing (the application of an alternative MT stabilizing drug or the overexpression of an MT stabilizer, e.g., tau).

      We thank the reviewer for suggesting this control experiment using the microtubule stabiliser Tau. We have now included these experiments in the revised version of the manuscript (Figure 4). The overexpression of GFP-Tau supports our hypothesis that cytoskeletal reorganisation in paclitaxel exerts mechanical stress on the nucleus during interphase, resulting in nuclear deformation and aberrations to the nuclear lamina. In particular, GFP-Tau overexpression resulted in large microtubule bundles throughout the cytoplasm (Figure 4A). Notably, in cells where these bundles extensively contacted the nucleus, we observed a significant decrease in nuclear solidity (Figure 4B) accompanied by changes in nuclear lamina organisation, including a patchy lamina phenotype, similar to that induced by paclitaxel (Figure 4C).

      The focus on nuclear lamina seems somewhat arbitrary and adjacent to previously published work by other groups. What would happen if the authors stained for focal adhesion markers? There would probably be a major change in number and distribution. Would the authors conclude that paclitaxel exerts a specific effect on focal adhesions? Or would the conclusion be that microtubule stabilization and the following mechanical disruption induce pleiotropic effects in cells? Which effects are significant for paclitaxel function on cancer cells?

      We thank the reviewer for raising important points regarding the specificity of paclitaxel's effects. We agree that microtubule stabilisation can induce myriad cellular changes, including alterations to focal adhesions and other cytoskeletal components. Our focus on Lamin A/C and nuclear morphology is grounded in the established clinical relevance of nuclear mechanics in cancer and builds on mechanistic work from other groups.

      Lamin A/C expression is commonly altered in cancer, and nuclear morphology is frequently used in cancer diagnosis35. Lamin A/C also plays a crucial role in regulating nuclear mechanics32 and, importantly, determines cell sensitivity to paclitaxel14. However, the mechanism by which Lamin A/C determines sensitivity of cancer cells to paclitaxel is unclear.

      Our data are consistent with Lamin A/C being a determinant of cellular sensitivity to paclitaxel. We also provide evidence that paclitaxel itself reduces Lamin A/C protein levels and disrupts its organisation at the nuclear envelope. We directly link these effects to microtubule bundling around the nucleus and degradation of the force-sensing LINC component SUN2, highlighting the importance of nuclear architecture and mechanics to overall cellular function. Furthermore, we show that recovery from paclitaxel treatment depends on Lamin A/C expression levels. This has clinical relevance because, unlike cancer cells, healthy tissue with a non-aberrant lamina would be able to recover from paclitaxel treatment.

      Minor comments:

      While I understand the difficulty of the experiments and the effort the authors have put into producing FIB-SEM tomograms, I am not sure they are helping their study or adding anything beyond the light microscopy images. Some of the images may even be in the way, such as supplementary Figure 6, which lacks in quality, controls, and interpretation. Do I see a lot of mitochondria in that slice?

      We agree with the reviewer that Supplementary Figure 6 does not add significant value to the manuscript and thank the reviewer for pointing this out. We have removed it from the manuscript accordingly.

      I may have overlooked it, but has the number of cells from which lamellae have been produced been stated?

      We thank the reviewer for pointing out the missing information. For our cryo-ET experiments, we collected data from 9 lamellae from paclitaxel-treated cells and 6 lamellae from control cells, with each lamella derived from a single cell. This information has now been added to the figure legend (Figure 2F).

      Significance

      The significance of studying the effect of paclitaxel, the most successful chemotherapy drug, should be broad and of interest to basic researchers and clinicians.

      As outlined above, I believe that major concerns about the design and interpretation of the study hamper its significance and advancements.

      We appreciate the reviewer's concerns and have performed major revisions to strengthen the significance of our study. Specifically, we conducted two key sets of experiments to validate our original conclusions: serum starvation to control for the effects of cell division, and overexpression of the microtubule stabiliser Tau to demonstrate that paclitaxel can affect the nucleus via its microtubule bundling activity in interphase.

      By elucidating the mechanistic link between microtubule stabilisation and nuclear-cytoskeletal coupling, our findings contribute to our understanding of paclitaxel's multifaceted actions in cancer cells.

      My areas of expertise could be broadly defined as Cell Biology, Cytoskeleton, Microtubules, and Structural Biology.

      Reviewer #3 Evidence, reproducibility and clarity

      The manuscript presents interesting new ideas for the mechanism of an old drug, taxol, which has been studied for the last 40 years.

      We thank the reviewer for the positive feedback.

      Although similar ideas are published, which may be suitable to be cited?

      • Paclitaxel resistance related to nuclear envelope structural sturdiness. Smith ER, Wang JQ, Yang DH, Xu XX. Drug Resist Updat. 2022 Dec;65:100881. doi: 10.1016/j.drup.2022.100881. Epub 2022 Oct 15. PMID: 36368286. Review.

      • Breaking malignant nuclei as a non-mitotic mechanism of taxol/paclitaxel. Smith ER, Xu XX. J Cancer Biol. 2021;2(4):86-93. doi: 10.46439/cancerbiology.2.031. PMID: 35048083. Free PMC article.

      We thank the reviewer for bringing to our attention these important review articles. In our initial manuscript, we only cited the original paper (14, also reference 14 in the original manuscript). We have now included citations to the suggested publications (20,21).

      We would also like to emphasise how our manuscript distinguishes itself from the work of Smith et al.14,20,21:

      • Cell-type focus: In their study14, Smith et al. examined the effect of paclitaxel on malignant ovarian cancer cells and proposed that paclitaxel's effects on the nucleus are limited to cancer cells. However, our data extend these findings by demonstrating paclitaxel's effects in both cancerous and non-cancerous backgrounds.

      • Cytoskeletal reorganisation: Smith et al. show reorganisation of microtubules in paclitaxel-treated cells14. Our data show re-organisation of other cytoskeletal components, including F-actin and vimentin.

      • Multimicronucleation: Smith et al. propose that paclitaxel-induced multimicronucleation occurs independently of cell division14. Although we observe progressive nuclear abnormalities during interphase over the course of paclitaxel treatment, our data do not support this conclusion; we find that multimicronucleation occurs only following mitosis.

      • Direct link between microtubule bundling and nuclear aberrations: We show that nuclear aberrations caused by paclitaxel during interphase (distinct from multimicronucleation) are directly linked to microtubule bundling around the nucleus, suggesting they result from mechanical disruption and altered force propagation.

      • Lamin A/C regulation: Consistent with Smith et al.14, we show that Lamin A/C depletion leads to increased sensitivity to paclitaxel treatment. However, we further demonstrate that paclitaxel itself leads to reduced levels of Lamin A/C and that this effect occurs independently of mitosis and is mediated via force-sensing LINC component SUN2. Upon SUN2 knockdown, Lamin A/C levels are no longer affected by paclitaxel treatment.

      • Recovery: Finally, our work reveals that cells expressing low levels of Lamin A/C recover less efficiently after paclitaxel removal. This might help explain how cancer cells could be more susceptible to paclitaxel.

      Only one cell line was used in all the experiments? "Human telomerase reverse transcriptase (hTERT) immortalised human fibroblasts" ? The cells used are not very relevant to cancer cells (carcinomas) that are treated with paclitaxel. It is not clear if the observations and conclusions will be able to be generalized to cancer cells.

      We thank the reviewer for this comment. Our initial study aimed to understand the effects of paclitaxel on nuclear architecture in non-aberrant backgrounds. To show that the observed effects of paclitaxel are also applicable to cancer cells, we have now repeated our main experiments using MDA-MB-231 human breast cancer cells (Supplementary Figure 1B; Supplementary Figure 3P-T). Similar to our findings in human fibroblasts, paclitaxel treatment of MDA-MB-231 led to cytoskeletal reorganisation (Supplementary Figure 1B), a decrease in nuclear solidity (Supplementary Figure 3P), aberrant (patchy) localisation of Lamin A/C (Supplementary Figure 3Q), and a reduction in Lamin A/C and SUN2 levels (Supplementary Figure 3R-T).

      "Fig. 1. (B) STORM imaging of α-tubulin immunofluorescence in cells fixed after 16 h incubation in control media or 5 nM paclitaxel. Lower panels show α-tubulin clusters generated with HDBSCAN analysis. Scale bars = 10 μm." It needs explanation of what is meaning of the different color lines in the lower panels, just different filaments?

      We have added further detail to the figure legend for clarification: "Lower panels show α-tubulin clusters generated with HDBSCAN analysis. Different colours distinguish individual α-tubulin clusters, representing individual microtubule filaments or filament bundles."

      Generally, the figures need additional description to be clear.

      We have added further clarification and detail to our figure legends.

      "Figure 3 - Paclitaxel results in aberrations to the nuclear lamina." The sentence seems not to be well constructed. "Paclitaxel treatment causes ..."?

      We changed this sentence to: "Figure 3 - Paclitaxel treatment results in aberrant organisation of the nuclear lamina and decreased Lamin A/C levels via SUN2."

      Lamin A and C levels are different in different images (Fig. 3B, H): sometimes Lamin A is higher, and sometimes Lamin C is higher? This may possibly be due to culture conditions or subtle differences in sample handling?

      We thank the reviewer for pointing this out and we agree that the ratio of Lamin A to Lamin C can vary with culture conditions. To confirm that paclitaxel treatment reduces total Lamin A/C levels regardless of this ratio, we repeated the Western blot analysis in three additional biological replicates using cells in which Lamin C levels exceeded Lamin A levels. These experiments confirmed a comparable decrease in total Lamin A/C levels. Figure 3B and 3C have been updated accordingly.

      Also, the effects on Lamin A/C and SUN2 levels are not significant or robust.

      Decreased Lamin A/C and SUN2 levels following paclitaxel treatment were consistently seen across three or more biological repeats (Figure 3B-C), and this could be replicated in a different cell type (MDA-MB-231) (Supplementary Figure 3R-T). Furthermore, Western blotting results are consistent with the patchy Lamin A/C distribution observed using confocal and STORM following paclitaxel treatment (Figure 3A; Supplementary Figure 3A), where Lamin A/C appears to be absent from discrete areas of the lamina.

      Any mechanisms are speculated for the reason for the reduction?

      We have now included additional data which aim to shed light on the mechanism behind the decrease in Lamin A/C and SUN2 levels following paclitaxel treatment. We found that SUN2 is selectively degraded during paclitaxel treatment. Immunoprecipitation of SUN2 followed by Western blotting against Polyubiquitin C showed increased SUN2 ubiquitination in paclitaxel (Figure 3M and N). Furthermore, in our original manuscript, we showed that Lamin A/C levels remained unaltered during paclitaxel treatment in cells where SUN2 had been knocked down. We propose that changes in microtubule organisation affect force propagation to Lamin A/C specifically via SUN2 and that this leads to Lamin A/C removal and depletion. Future work will be needed to fully understand this mechanism.

      In addition to the findings described above, we report no significant changes in mRNA levels for LMNA or SUN2 in paclitaxel (Supplementary Figure 3B and O). Phos-tag gels followed by Western blotting analysis for Lamin A/C also did not detect changes to the overall phosphorylation status of Lamin A/C due to paclitaxel treatment (Supplementary Figure 3E and F). This is in agreement with our initial data showing no changes to Lamin A/C Ser404 phosphorylation levels (Supplementary Figure 3C). Finally, Lamin A/C immunoprecipitation experiments followed by Western blotting for Polyubiquitin C and acetyl-lysine showed no significant changes in the ubiquitination and acetylation state of Lamin A/C in paclitaxel-treated cells (Supplementary Figure 3G-I).

      Also, the about 50% reduction in protein level is difficult to be convincing as an explanation of nuclear disruption.

      The nuclear lamina and LINC complex proteins play a critical role in regulating nuclear integrity, stiffness and mechanical responsiveness to external forces28,31-33,54,75, as well as in maintaining the nuclear intermembrane distance69,74. In particular, SUN-domain proteins physically bridge the nuclear lamina to the cytoskeleton through interactions with Nesprins, thereby preserving the perinuclear space distance30,69,74. Mutations in Lamins have been shown to disrupt chromatin organization, alter gene expression, and compromise nuclear structural integrity, and experiments with LMNA knockout cells reveal that nuclear mechanical fragility is closely coupled to nuclear deformation47. Furthermore, nuclear-cytoskeletal coupling is essential during processes such as cell migration, where cells undergo stretching and compression of the nucleus; weakening or loss of the lamina in such cases compromises cell movement47,73. In our work, we show that alterations to nuclear Lamin A/C and SUN2 by paclitaxel treatment coincide with nuclear deformations (Figure 2A-D, F, G; Figure 3A-D, F, G; Supplementary Figure 3A, P-T) and that these deformations are reversible following paclitaxel removal (Supplementary Figure 4B-D). Our experiments also demonstrate that Lamin A/C expression levels significantly influence cell growth, cell viability, and cell recovery in paclitaxel (Figure 5). Therefore, drawing on current literature and our results, we propose that, during interphase, paclitaxel induces severe nuclear aberrations through the combined effects of: i) increased cytoskeletal forces on the NE caused by microtubule bundling; ii) loss of ~50% Lamin A/C and SUN2; iii) reorganisation of nucleo-cytoskeletal components.

      Significance

      The manuscript presents interesting new ideas for the mechanism of an old drug, taxol, which has been studied for the last 40 years.

      The data may be improved to provide stronger support.

      Experiments may be repeated in additional cell lines (of cancer or epithelial origin) to confirm the generality of the observation and conclusions.

      We thank the reviewer for the feedback and valuable suggestions. In response, we have included experiments using human breast cancer cell line MDA-MB-231 to further corroborate our findings and interpretations. We believe these additions have improved the clarity, robustness and impact of our manuscript, and we are grateful for the reviewer's contributions to its improvement.

    1. Reviewer #2 (Public review):

      Summary:

      In this work, Ganesh and colleagues use experimental data from Hi-C and from live-cell imaging to evaluate different polymer models of 3D genome organization in Drosophila based on both structural and dynamic properties. The authors consider several leading hypotheses, which are examined sequentially in increasing level of complexity - from the minimal Rouse polymer, to a model combining sequence-specific compartmentalization and loop-extrusion without extrusion blockers. They conclude that the combination of both compartmentalization and loop-extrusion gives the best agreement with the data. Their analysis also leads to concrete predictions about the processivity of cohesin loop extrusion in Drosophila, and a conclusion that the compartmental interaction strength is poised near criticality in the coil-globule phase space.

      Strengths:

      There is considerable interest in the field in understanding the mechanisms responsible for the 3D spatial organization of the genome and the dynamic movement of the genome, which has major implications for our understanding of long-range transcriptional regulation and other genome behaviors. The live-cell experimental work on which this study draws highlights the limitations of existing models to explain even the dynamic behaviors observed in the data, further exciting interest in further exploration. Therefore, this paper seeks to address an important gap in the field. The work is written in a well-organized, well-illustrated fashion. The text and figures are nicely integrated, easy to read, and explain challenging concepts with elegance and brevity in a manner that will be accessible to a broad audience.

      Weaknesses:

      The validity and utility of these conclusions are, in my view, substantially undermined by what appear to be unappreciated peculiarities of the live-cell data set that was used to constrain the model. The live-cell data come from embryos that were edited in a way that intentionally and substantively changed both the 3D genome structure and dynamics specifically at the loci that are imaged, a case which is neither explained by any of the models suggested nor acknowledged in the current work, and which is not compatible with the Hi-C data that are simultaneously used to evaluate these models. As these ignored synthetic alterations have been previously shown to be determinative of transcriptional activity, the relevance of the authors' work to transcriptional control (a prime motivation in the introduction) is unclear.

      The agreement in 3D organization, as represented in chromosome-scale contact frequency heatmaps, is substantially less impressive than the agreement seen in prior work with similar models. This discrepancy appears to be due in part to the unappreciated effects of the edits mentioned in the previous limitation, as well as inappropriate choices in the metrics used to evaluate agreement. It is also not particularly surprising that combining more models, with more free parameters, results in an improvement in the quality of fit.

      Some major results, including both theoretical works and experimental ones, are ignored, despite their relevance to the stated objective of the work. The current manuscript and analysis could be improved substantially by a consideration of these works.

      I describe these issues in more detail below.

      Major issues:

      (1) The genetic element "homie" is present in a subset of the data: The experimental data used in this analysis come from different fly lines, half of which have been edited explicitly to alter genome structure and consequent transcriptional behavior, yet the authors are trying to fit with a common model - a problem which substantially undermines the utility of the analysis.

      Specifically, the authors evaluate a series of models/simulations in turn by comparing them to Hi-C from wildtype Drosophila embryos on the chromosome scale and to 3D distances and dynamics from live-cell imaging in genetically edited embryos. The exercise fatally overlooks a critical fact (admittedly not easily noticed in the work from Bruckner et al.): the fly embryos used for nearly all their analyses contain not only fluorescent labels, but also two copies of a powerful genetic sequence, "homie", known for its ability to dramatically change the 3D organization and dynamics of the genome. Whether or not the fluorescent labels themselves used in the study further alter structure and dynamics is not entirely clear (and will require further work beyond the scope of either study), but at least these fluorescent labels aren't known to dramatically affect 3D structure and dynamics the way homie is. The critical problem is that adding or removing "homie", as shown in a collection of prior works I describe below in more detail, dramatically affects structure, dynamics, and gene expression. Whether or not the genome contains two distal cis-linked copies of homie fundamentally changes genome structure and dynamics, so to use one dataset which has this edit (the live-cell data) and one dataset which lacks it (the Hi-C data) is, in some sense, to guarantee failure of any model to match all the data.

      If the authors had chosen instead to focus exclusively on the 'no homie' genetic lines in the Bruckner data, they would have a much smaller dataset (just 2 distances), which would not cover all the length scales of interest, but it would at least be a dataset not known to be contradictory to the Hi-C. The two 'no homie' lines make much more plausible candidates for the sort of generalizable polymer dynamics these authors seek to explain, as will hopefully be made more clear by a brief review of what is known about homie. I next describe the published data that support these conclusions about how homie affects 3D genome spatial organization and dynamics:

      What is "homie" and how does it affect 3D genome distances, dynamics, and gene expression?

      The genetic element "homie" was named by James Jaynes' lab (Fujioka...Jaynes 2009) in reference to its remarkable "homing" ability - a fascinating and still poorly understood biological observation that some genetic sequences from Drosophila, when cloned on plasmids and reintegrated into the genome with p-elements, had a remarkable propensity to re-integrate near their endogenous sequence (Hama et al., 1990; Kassis, 2002; Taillebourg and Dura, 1999; Bender and Hudson, 2000; Fujioka...Jaynes 2009). By contrast, most genetic elements tend to incorporate at random across the genome in such assays (with some bias for active chromatin).

      The Jaynes lab subsequently showed that in flies carrying two copies of homie, one integrated in cis ~140 kb distal from the endogenous element, the two copies formed preferential cis contacts with one another. Indeed, if a promoter and reporter gene were included at this distal integration site, the reporter gene would be expressed in the pattern normally seen for the gene even-skipped. The endogenous copy of homie marks one border of a ~16 kb mini-TAD which contains the even-skipped gene (eve) and its developmental enhancers, so this functional interaction provides further evidence of physical proximity (as was also shown by 3C by Jaynes (Fujioka..., Schedl, Jaynes 2016), and later with elegant live imaging, by Jaynes and Gregor (Chen 2018)).

      Critically, if either copy of homie is deleted or substantially mutated, the 3D proximity is lost (Fujioka 2016, Chen 2018, Bruckner 2023), and the expression of the transgene is dramatically reduced (at 58 kb) or lost. Given the authors' motivation of understanding "E-P" interactions, the fact that the increased 3D proximity provided by homie is as essential for transcription as the promoter itself at the ~150 kb distance underscores that these are not negligible changes.

      These effects can be seen by plotting the data from Bruckner 2023, which includes data from labels with separations of 58 kb and ~150 kb "no homie" as well as homie. Unfortunately, the authors don't plot this data in the manuscript in the comparison of 3D distances, though the two-point MSD can be seen in Figure S13C, and laudably, the data is made public in a well-annotated repository on Zenodo, noted in the study. Note that the distance data in Figure S13 were filtered to exclude the transcriptionally off state, and are thus not the quantity the current authors are interested in. If they plot the published data for no homie, they will see the clear effect on the average 3D distance, R(s), and a somewhat stronger effect on the contact frequency P(s), which causes significant deviation from the trend-line followed by the homie-containing data.

      (2) The agreement between the "best performing" simulations for all models and the Hi-C data is not on par with prior studies using similar approaches, apparently due to some erroneous choices in how the optimization is carried out:

      Hi-C-comparison

      The 'best fit' simulation Hi-C looks strikingly different from the biological data in all comparisons, with clearly lower agreement than other authors have shown using highly similar methods (e.g., Shi and Thirumalai 2023; Di Pierro et al. 2017; Nuebler et al. 2018; Esposito et al. 2022; Conte et al. 2022), among many others. I believe this results from a few issues with how the current authors select and evaluate the data in their work:

      (a) Most works have used Pearson's correlation rather than Spearman's correlation when comparing simulation and Hi-C contact frequencies. Pearson's correlation is more appropriate when we expect the values to be linearly related, which they should be in this case, as they are constructed indeed to be measuring the same thing (contact frequency), just derived from two different methods. Spearman's correlation would have been justifiable for comparing how transcription output correlates with contact frequency. This may fix the bafflingly low correlations reported at lower adhesion values in Figure S2C.

      (b) Choice of adhesion strengths - The Hi-C map comparison in Figure 3 strongly suggests that a much more striking visual agreement would have been achieved if much weaker (but still non-zero) homotypic monomer affinity had been selected. In the authors' simulation, the monomer state (A/B identity) strongly dominates polymer position, resulting in the visual appearance of an almost black-and-white checkerboard. The data, meanwhile, look like a weak checkerboard superimposed on the polymer.

      (c) A further confounding problem is the aforementioned issue that the Hi-C data don't come from the edited cell lines, and that the interaction of the two Homie sites is vastly stronger than the compartment interactions of this region of the genome.
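      The distinction drawn in point (a) between the two correlation measures can be illustrated with a minimal sketch. It uses toy contact-frequency curves, not the authors' data or pipeline: the arrays `hic`, `sim_linear` and `sim_monotone` are hypothetical, and the rank-based Spearman helper assumes there are no ties.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: sensitive to how linear the relationship is."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman correlation as Pearson on ranks (assumes no tied values)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Hypothetical contact frequency decaying with genomic separation s.
s = np.arange(1, 201)
hic = s ** -1.0

# A simulation that is proportional to the experiment (same quantity, rescaled).
sim_linear = 2.5 * hic
# A simulation that only preserves the ordering of contacts (monotone, nonlinear).
sim_monotone = hic ** 3

print("linear   :", pearson(hic, sim_linear), spearman(hic, sim_linear))
print("monotone :", pearson(hic, sim_monotone), spearman(hic, sim_monotone))
```

      In this toy case Spearman's correlation cannot tell the two simulations apart, because both preserve the ordering of contacts, whereas Pearson's correlation drops for the nonlinear one. That is the sense in which Pearson is the stricter test when both maps are meant to measure the same quantity.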

      (3) Some important concepts from the field are ignored:

      The crumpled/fractal globule model is widely discussed in the literature (including the work containing the data used in this study) - its exclusion from this analysis thus appears as a substantial gap/oversight:

      A natural alternative to the much-discussed Rouse polymer model is the "crumpled polymer" (Grosberg et al. 1988; Grosberg 2016; Halverson et al. 2011; Halverson et al. 2011), also known as the "fractal globule" (Lieberman-Aiden et al. 2009; Mirny 2011; Dekker and Mirny 2016; Boettiger et al. 2016), much discussed for the way it captures the ⅓ scaling of R(s), found for much of the genome (or, equivalently, the -1 exponent of the probability of contact as a function of genome separation, P(s)). Given the 1/3rd scaling in the data, and the fact that the original authors highlighted the crumpled model in addition to the Rouse model, it seems that this comparison would be instructive and the lack of discussion an oversight. Moreover, while prior works (e.g., Bruckner, Gregor, 2023) used some traditional simplifying assumptions to estimate the MSD and relaxation time scaling of this model, I believe a more rigorous analysis with explicit simulations (as in Figure 1 for the Rouse model) would be instructive for the crumpled polymer simulations. Note the crumpled globule is not necessarily the same as the globule in the coil-globule transition discussed here - it requires some assumptions about non-entanglement to stay trapped in the meta-stable state which has the 1/3rd R(s) scaling that is indicative of this model, and not the 1/2 exhibited by equilibrium globules (for s << length of the polymer) and dilute polymers alike.
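      The equivalence between the R(s) and P(s) exponents mentioned above can be written out in one line. This is the standard mean-field estimate, which assumes the contact probability scales as the inverse of the volume explored by a segment of length s; the exponent symbol ν is introduced here only for illustration.

```latex
R(s) \sim s^{\nu}
\quad\Longrightarrow\quad
P(s) \sim R(s)^{-3} \sim s^{-3\nu},
\qquad
\nu = \tfrac{1}{3} \;\Rightarrow\; P(s) \sim s^{-1},
\qquad
\nu = \tfrac{1}{2} \;\Rightarrow\; P(s) \sim s^{-3/2}.
```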

      While the fit in Figure 2 appears to get closer to the 1/3rd exponent (B = 0.32), this appears to be a largely coincidental illusion of agreement - the simulation data in truth show a systematic deviation, returning to the 1/2 scaling for distances from 500 kb to whole chromosomes. This feature is not very evident as the authors restrict the analysis to only the few points available in the experimental data, though had they tested intervening distances I expect they would show that log-log P(s) is nonlinear (non-power-law) for distances from less than the typical loop length up to a few fold larger than the loop length, and thereafter returns to the scaling provided by the 'base' polymer behavior. This appears to be Rouse-like in these authors' model, with R(s) going like 1/2, even though the data are closer to 1/3rd, as indeed do most published simulated P(s) curves based on loop extrusion - e.g., (Fudenberg et al. 2016; Nuebler et al. 2018). In this vein, it would be instructive to the readers if the authors would include additional predictions from the simulation on the plot that lie at genomic separation distances not tested in the data, to better appreciate the predictions.
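      One way to make such a systematic deviation visible is to estimate the scaling exponent locally, in sliding windows along s, rather than from a single global fit. The following is a minimal sketch on a toy curve, not on the authors' simulation output: the crossover position `s_c`, the window size and the function name `local_exponent` are all hypothetical choices.

```python
import numpy as np

def local_exponent(s, y, window=5):
    """Slope of log(y) versus log(s) in sliding windows of `window` points."""
    log_s, log_y = np.log(s), np.log(y)
    slopes = []
    for i in range(len(s) - window + 1):
        slope, _intercept = np.polyfit(log_s[i:i + window], log_y[i:i + window], 1)
        slopes.append(slope)
    return np.array(slopes)

# Toy R(s): exponent 1/3 below a crossover s_c, exponent 1/2 above it
# (continuous at s_c). Units and values are arbitrary.
s = np.logspace(1, 4, 40)
s_c = 500.0
r = np.where(s < s_c, s ** (1 / 3), s_c ** (1 / 3) * (s / s_c) ** 0.5)

exponents = local_exponent(s, r)
print("exponent at short separations:", exponents[0])
print("exponent at long separations :", exponents[-1])
```

      A single power-law fit through a handful of points straddling the crossover would return an intermediate exponent, which is why evaluating the model at separations between and beyond the measured ones is informative.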

      Minor issues

      (1) I think it is too misleading to only describe the experimental data from Bruckner as "E-P" interactions from Drosophila. It is important to note somewhere that this is not an endogenous interaction with a functional role in Drosophila - it is a synthetic interaction between enhancers in the vicinity of the eve gene and a synthetic promoter placed at a variable distance away. The uniformity is elegant (it is the same pair of elements being studied at all distances), but also provides limited scope for generalization as suggested by the current text. Moreover, the enhancers were not directly labeled; rather, the 3D position of nascent RNA transcribed from eve was tracked with an RNA-binding protein and used as a proxy for the 3D position of the enhancers. There is not an individual enhancer at the eve locus that interacts with the transgene, but rather a collection of enhancers distributed at different positions throughout the entire TAD, which contains eve, and these must form separate loops to reach eve. Indeed, it was previously reported that differences in the local position of these enhancers, relative to eve, affect their ability to interact with the distal reporter gene and the endogenous eve gene (Chen 2018). There is also reported competition between these enhancers and the distal gene, which further complicates the analysis (especially since the state of eve and of its enhancers varies among the different cells as a function of stripe position) - see Chen 2018. All of this is ignored in the current work, despite the assertion of the application to understanding E-P interaction. A detailed discussion of these issues is not necessary, but I fear that ignoring them entirely is to invite further confusion and error.

      (2) I believe this sentence is overstated, given available data: "TAD borders are characterized by transitions between epigenetic states rather than by preferentially-bound CTCF [4, 23, 24]." Indeed, this claim has been repeatedly made in the literature as cited here. However, other data clearly demonstrate a strong enrichment of CTCF at TAD borders (and at epigenetic borders, which in Drosophila have a high correspondence with TAD borders, as the authors have already appropriately noted). See, for example, Figure 4 of Sexton Cell 2012, and compare to Figure 2 of Dixon 2012. Of minor note, CTCF peaks co-occupied by the Zinc Finger TF CP190 are more likely to be TAD borders than CTCF alone. How big a species-specific difference this is remains unclear, as it appears some mammalian CTCF-marked TAD boundaries may be co-occupied by additional ZNFs. While plenty of Drosophila TAD boundaries indeed lack CTCF, many are marked by CTCF; this is enriched relative to what would be expected by chance (or relative to the alignment of other TFs, like Twist or Eve, with TAD boundaries), and it has been shown that CTCF loss is sufficient to remove a subset of these; see, for example, Figure 5 of (Kaushal et al. 2021) (though it is possible that most will require mutation of all the border-associated factors that collectively bind many of the borders: dCTCF, CP190, mod(mdg4) and others).

      (3) This assertion is overstated given available data: "Although TAD boundaries in Drosophila are often associated with insulator proteins [20], there is no direct evidence that these elements block LEFs in vivo. Therefore, we did not impose boundary constraints in our simulations; LEFs were allowed to move freely unless stalled by collisions with other LEFs, with the possibility of crossover." Deletion of insulators in Drosophila that lie within a common epigenetic state leads to fusion of TADs (e.g., Mateo et al., 2019 - deletion of the CTCF-marked Fub insulator, in posterior tissues where both flanks of Fub are active; Kaushal, 2021, has examples as well). Loss of CTCF causes a small number of TADs to fuse as measured by Hi-C. This is far from 'direct evidence that insulators block LEFs' - as the authors have already noted, even the idea that cohesin extrudes loops in Drosophila in the first place is indeed controversial. However, LEF activity and stalling at insulators would provide a very natural explanation of why chromatin in a shared epigenetic state should form distinct TADs, and why these TADs should fuse upon insulator deletion. Justifying the lack of stalling sites based on empirical data is thus not very convincing to this reviewer. I believe it would be more apt to simply describe this as a simplifying assumption, rather than the above phrase, which may be misleading.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary

      In this investigation Kapustin et al. demonstrate that exposure of vascular smooth muscle cells (VSMCs) to the extracellular matrix protein fibronectin stimulates the release of small extracellular vesicles (sEVs). The authors provide experimental evidence that stimulation of the actin cytoskeleton boosts sEV secretion and posit that sEVs harbor both fibronectin and collagen VI protein themselves which also, in turn, alter cell migration parameters. It is well established that fibronectin is associated with increased cell migration and adherence; therefore, this association with VSMCs is not novel.

      The reviewer is correct that FN has been associated with migration and adherence in previous studies. However, we have extended these observations to show that the extracellular fibronectin matrix stimulates small extracellular vesicle (sEV) secretion by modulating the actin cytoskeleton. We also showed that sEVs are trapped in the extracellular matrix and that, by presenting collagen VI, they induce early focal adhesion formation, reduce excessive cellular spreading and guide cell invasion directionality through a 3D matrix. Hence, sEVs mediate cell-matrix cross talk and change cell behaviour in the context of the fibronectin matrix. This is critically important for the vasculature, where regulated VSMC invasion is essential for repair, with its deregulation leading to pathology.

      The authors purport that sEV are largely born of filopodia origin; however, this data is not well executed and seems generally at odds with the presented data.

      Our experimental data showed that CD63 MVs are associated with filopodia in fixed and live cells (Fig 2E, 2F and Video S1) and that inhibition of filopodia formation using the formin inhibitor SMIFH2 reduced sEV secretion on FN (Fig 2B). However, we agree with the reviewer that further studies are required to connect sEV secretion to filopodia. To address this we have provided further data analysis but also toned down our conclusions regarding this point. Changes include:

      (1) Title: Matrix-associated extracellular vesicles modulate smooth muscle cell adhesion and directionality by presenting collagen VI.

      (2) Results, section title: 2. FN-induced sEV secretion is modulated by Arp2/3 and formin-dependent actin cytoskeleton remodelling

      (3) Results, page 6 Line 27-44 and conclusion page 7, Ln 3 “Interestingly, CD63+ MVBs can be observed in filopodia-like structures suggesting that sEV secretion can also occur spatially via cellular protrusion-like filopodia but more studies are needed to confirm this hypothesis.”

      (4) Discussion, page 12, line 19. “Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      Similarly, the effect of sEVs on parameters of cell migration has almost no magnitude of effect, making mechanism exploration somewhat nebulous.

      VSMCs are mesenchymal-type cells with a low migration rate and we agree that the changes in motility are not of great magnitude even for the positive controls, suggesting that this is a complex, multifactorial process for VSMCs. In our experiments we collected data from >5000 individual cells to measure the average speed and found that fibronectin matrix on its own increased VSMC speed from ~0.61 μm/min to ~0.68 μm/min (a ~12% increase), which was statistically significant (Fig 5A). Addition of a sEV inhibitor caused a modest but significant decrease in cellular speed. Interestingly, addition of ECM-associated sEVs did not influence cell speed in 2D or 3D assays. However, in a 3D model we observed a 22% change in cell directionality (Fig 5G) and a 235% change in cell alignment index (FMI, Fig 5H), which we believe is very strong evidence that VSMC-derived sEVs are involved in the regulation of VSMC invasion directionality. These data are also in agreement with sEV effects in tumour cells (Sung et al., 2015), though this previous study did not identify the factor driving the directionality, and we think our collagen VI data significantly extend these previous observations.

      Results, page 9: “Hence, ECM-associated sEVs have modest influence on VSMC speed but influence VSMC invasion directionality.”
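      The quoted relative increase in mean speed follows directly from the two rounded values given above; since both values are approximate, the result is approximate as well.

```latex
\frac{0.68 - 0.61}{0.61} = \frac{0.07}{0.61} \approx 0.115 \quad (\text{about } 11.5\%,\ \text{i.e. roughly } 12\%)
```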

      Lastly, the proposed mechanism of VSMCs responding to, and depositing, ECM proteins via sEVs was not rigorously executed; again, making the conclusions challenging for the reader to interpret.

      We appreciate the reviewer’s comment regarding the mechanistic aspects of VSMCs responding to and depositing ECM proteins via sEVs. In our revised manuscript, we have expanded the data demonstrating that sEVs can be retained within the extracellular matrix (see Figs 3A, 3B, S3A, S3B). Additionally, we show that collagen VI is present on the surface of sEVs, where it may modulate cell adhesion and influence the directionality of cell invasion (Fig 7E). Our results further indicate that both fibronectin (FN) and collagen VI can be recycled through multivesicular bodies (see Figs S3C, S3D, S3E–S3G). However, we acknowledge that the precise mechanisms governing the selective loading of ECM proteins onto sEVs, as well as the specific contributions of sEVs to overall ECM organization, remain to be fully elucidated and warrant further investigation. Based on our current evidence, we propose that collagen VI–loaded sEVs act primarily in a signaling capacity by modulating focal adhesion formation but are not directly involved in ECM structural remodeling.

      Results, page 7: To quantify ECM-trapped sEVs we applied a modified protocol for the sequential extraction of extracellular proteins using salt buffer (0.5M NaCl) to release sEVs which are loosely attached to ECM via ionic interactions, followed by 4M guanidine HCl buffer (GuHCl) treatment to solubilize strongly bound sEVs (Fig S3A) [42]. We quantified total sEVs and characterised the sEV tetraspanin profile in conditioned media, and in the 0.5M NaCl and GuHCl fractions, using ExoView. The total particle count showed that EVs are both loosely bound and strongly trapped within the ECM. sEV tetraspanin profiling showed differences between these three EV populations. There was close similarity between the conditioned media and the 0.5M NaCl fraction, with high abundance of CD63+/CD81+ sEVs as well as CD63+/CD81+/CD9+ sEVs in both fractions (Fig S3A). In contrast, the GuHCl fraction was particularly enriched with CD63+ and CD63+/CD81+ sEVs, with very low abundance of CD9+ EVs (Fig S3A). The abundance of CD63+/CD81+ sEVs was confirmed independently by a CD63+ bead capture assay in the media and loosely bound fractions (Fig S3B).

      Results, page 7: We previously found that the serum protein prothrombin binds to the sEV surface both in the media and MVB lumen, showing that it is recycled in sEVs and catalyses thrombogenesis while on the sEV surface [43]. We therefore investigated whether FN can also be associated with the sEV surface, where it can be directly involved in sEV-cell cross-talk [43]. We treated serum-deprived primary human aortic VSMCs with FN-Alexa568 and found that it was endocytosed and subsequently delivered to early and late endosomes together with fetuin A, another abundant serum protein that is a recycled sEV cargo and elevated in plaques (Figs S3C and S3D). CD63 visualisation with a different fluorophore (Alexa488) confirmed FN colocalization with CD63+ MVBs (Fig S3E). Next, we stained non-serum-deprived VSMCs cultured in normal growth media (RPMI supplemented with 20% FBS) with an anti-FN antibody and observed colocalization of CD63 and serum-derived FN. Colocalization was reduced, likely due to competitive bulk protein uptake by non-deprived cells (Fig S3F). Notably, when we compared FN distribution in sparsely growing VSMCs versus confluent cells we found that FN intracellular spots, as well as colocalization with CD63, completely disappeared in the confluent state (Fig S3F and S3G). This correlated with nearly complete loss of CD63+/CD81+ sEV secretion by the confluent cells, indicating that confluence abrogates intracellular FN trafficking as well as sEV secretion by VSMCs (Fig S3H). Finally, FN could be co-purified with sEVs from VSMC conditioned media (Fig S3I) and detected on the surface of sEVs by flow cytometry, confirming its loading and secretion via sEVs (Fig 3C).

      Results, page 10: Collagen VI was the most abundant protein in VSMC-derived sEVs (Fig 7B, Table S7) and was previously implicated in the interaction with the proteoglycan NG2 [53] and suppression of cell spreading on FN [54]. To confirm the presence of collagen VI in ECM-associated sEVs we analysed sEVs extracted from the 3D matrix using 0.5M NaCl treatment and showed that both collagen VI and FN are present (Fig 7D). Next, we analysed the distribution of collagen VI using dot-blot. Alix staining was bright only upon permeabilization of sEVs, indicating that it is preferentially a luminal protein (Fig 7E). On the contrary, CD63 staining was similar in both conditions, showing that it is a surface protein (Fig 7E). Interestingly, collagen VI staining revealed that 40% of the protein is located on the outside surface, with 60% in the sEV lumen (Fig 7E).

      Discussion, page 12: “In fact, we observed that an extensive secretion of sEVs effectively ceased protrusion activity; also VSMCs acquired a rounded morphology when “hovering” over the FN matrix decorated with sEVs (data not shown). Hence, it will be interesting in future studies to investigate whether sEVs can stimulate Rho activity by presenting adhesion modulators—particularly collagen VI—on their surface, thereby guiding cell directionality during invasion.”

      Discussion, page 14: “In summary, cooperative activation of integrin signalling and F-actin cytoskeleton pathways results in the secretion of sEVs which associate with the ECM and play a signalling role by controlling FA formation and cell-ECM crosstalk. Further studies are needed to test these mechanisms across various cell types and ECM matrices.”

      Strengths

      The authors provide a comprehensive battery of cytoskeletal experiments to test how fibronectin and sEVs impact both sEV release and vascular smooth muscle cell migratory activation.

      We appreciate this comment reflecting our efforts to apply a range of orthogonal methods to show the role of the integrin/actin cytoskeleton in ECM-stimulated sEV secretion.

      Weaknesses

      Unfortunately, this article suffers from many weaknesses. First, the rigor of the experimental approach is low, which calls into question the merit of the conclusions. In this vein, there is a lack of proper controls or inclusion of experiments addressing alternative explanations for the phenotype or lack thereof.

      We acknowledge this comment and agree that there was not sufficient evidence to conclude that sEV secretion occurs via filopodia despite the microscopy/inhibitory data, so this claim has now been excluded from the study. However, we believe that our experimental data do clearly show that FN stimulates the secretion of collagen VI-loaded sEVs which are trapped by the ECM and have the capacity to modulate VSMC adhesion and invasion directionality. To support this, we have now extended the dataset in the revised version:

      (1) In addition to the use of inhibitors and live cell analysis we have added quantitative data confirming that a large proportion of CD63+ endosomes are associated with F-actin/cortactin tails and this colocalization is increased upon the inhibition of sEV secretion with 3-OMS (Fig  2D, Fig S2B).

      (2) We developed a method to extract ECM-associated sEVs and quantified/characterized these using ExoView Assays further confirming significant sEV entrapment by the ECM (Figs 3B, S3A, S3B).    

      (3) We extended the controls to confirm FN delivery to CD63+ endosomes and showed that FN recycling is stopped upon reaching cell confluence (Figs S3F, S3G and Fig S3H).

      (4) We included more intensive characterisation of human atherosclerotic plaque morphology (H&E, Masson’s trichrome staining, Orcein, elastin fibers staining) to confirm predominant accumulation of sEV in the neointima (Figs S4A, S4B and S4C). We also excluded an endothelial origin for the  CD81+ sEVs (Fig 4G).

      (5) We included individual cellular tracks to the 2D migration analysis to confirm the statistical significance and concluded that ECM-associated sEVs regulate cell invasion directionality but not the cell speed (Figs 5A and 5B).

      (6) We showed surface localisation of collagen VI on sEVs confirming that it can activate signalling pathways leading to early FA formation on the FN matrix  (Figs 7D and 7E).

      (7) We included alternative explanations for some of our data in the discussion.      

      Reviewer #2 (Public Review):

      Extracellular vesicles have recently gained significant attention across a wide variety of fields, and they have therefore been implicated in numerous physiological and pathophysiological processes. When such a discovery and an explosion of interest occur in science, there is often much excitement and hope for answers to mechanisms that have remained elusive and poorly understood. Unfortunately, there is an equal amount of hype and overstatement that may also be put forth in the name of "impact", but this temptation must be avoided so that scientists and the broader public are not misled by overreaching interpretations and statements that lack rigorous and fully convincing evidence.

      Thank you for your comment; we agree that investigating sEVs is particularly challenging due to their heterogeneity and nano-size, as well as complex biogenesis mechanisms. ECM-associated sEVs are a very new direction for the EV field, but one that is particularly relevant to the vasculature, where cells must invade through a thick ECM and where the accumulation of ECM-bound EVs is a unique and documented phenomenon. To further strengthen our conclusions we have included new data to support our statements but also excluded statements regarding filopodia as the origin of sEVs, which are out of the scope of our study and need to be investigated further.

      The study presented by Kapustin et al. is certainly intriguing and timely, and it offers an interesting working hypothesis for the fields of extracellular vesicles and vascular biology to consider. The authors do a reasonable job at detecting these small extracellular vesicles, though some aspects of data presentation are missing such as full Western blots with accompanying size markers for the viewer to more fully appreciate that data and comparisons being made (see Figures 1 and 7).

      We agree with the reviewer and have now included molecular weight markers (Fig 1F, 7C, 7D, S3I, S4E) and provided all original western blot scans (uncropped and unedited) to the eLife editor. 

      Much of the imaging data from cell-based experiments is strong and conducted with many cutting-edge tools and approaches. That said, the static images and the dynamic imaging fall short of being fully convincing that the small extracellular vesicles found in the neighboring extracellular matrix are indeed being deposited there via the smooth muscle cell filopodia. Many of the lines of evidence presented suggest that this could occur, but alternative hypotheses also exist that were not fully ruled out, such as the ECM-deposited vesicles were secreted more from the soma and/or the lamellipodia that are also emitted and retracted from the cells. In particular, the authors show very nice dynamic imaging (Supplementary Figure S2A and Supplemental Video S1) that is interpreted as "extracellular vesicles being released from the cell" and these are seen as "bursts" of fluorescent signal; however, none of these appear to occur in filopodia as they appear within the cell proper (a "burst" of signal vs. a more intense "streak" of signal), which would be a stronger and more consistent observation predicted by the working model proposed by the authors.

      Our live and fixed cell microscopy data, as well as the inhibitor analysis, showed that sEV secretion can be associated with filopodia. However, we agree with the reviewer that the data generated using the pHluorin GFP marker clearly indicate that the majority of sEVs are secreted from the cell soma toward the ECM.

      To reflect this, we have added further changes:

      (1) Title: Matrix-associated extracellular vesicles modulate smooth muscle cell adhesion and directionality by presenting collagen VI.

      (2) Results, section title: 2. FN-induced sEV secretion is modulated by Arp2/3 and formin-dependent actin cytoskeleton remodelling

      (3)  Results, page 6 Line 27-36 “Formins and the Arp2/3 complex play a crucial role in the formation of filopodia, a cellular protrusion required for sensing the extracellular environment and cell-ECM interactions36. To test whether MVBs can be delivered to filopodia, we stained VSMCs for Myosin-10 (Myo10)37. We observed no difference between total filopodia number per cell on plastic or FN matrices (n=18±8 and n=14±3, respectively) however the presence of endogenous CD63+ MVBs along the Myo10-positive filopodia were observed in both conditions (Fig 2E, arrows). Filopodia have been implicated in sEV capture and delivery to endocytosis “hot-spots”38, so next we examined the directionality of CD63+ MVB movement in filopodia by overexpressing Myo10-GFP and CD63-RFP in live VSMCs. Importantly, we observed anterograde MVB transport toward the filopodia tip (Fig 2F and Supplementary Video S2) indicative of MVB secretion”.

      (4) Results, page 6, Ln 37-44 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplemental Video S1).”

      (5) Results, page 7 Ln 3 “Interestingly, CD63+ MVBs can be observed in filopodia-like structures suggesting that sEV secretion can also occur spatially via cellular protrusion-like filopodia but more studies are needed to confirm this hypothesis.”

      (6) Discussion, page 12, line 19. “Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      Imaging of related human samples is certainly a strength of the paper, and the authors are commended for attempting to connect the findings from their cell culture experiments to an important clinical scenario. However, the marker selected for marking extracellular vesicles is CD81, which has been described as present on the endothelium of atherosclerotic plaques with a proposed role in the recruitment of monocytes into diseased arteries (Rohlena et al. Cardiovasc Res 2009). More data should address this potentially confounding interpretation of the signals presented in images within Figure 4.

      We thank the reviewer for this insightful comment that the sEV marker CD81 can originate from endothelial cells, in agreement with Rohlena et al., 2009. To address this, we investigated the spatial overlap between CD81 and the endothelial marker CD31. We observed very strong CD81 staining in the intact endothelial cell (intima) layer and occasional CD31-positive cells in the neointima. Importantly, quantification of colocalization confirmed that 80% of CD81 in the neointima does not overlap with CD31, excluding an endothelial origin of these sEVs (Fig 4G). Moreover, we included a complete morphological characterisation of the atherosclerotic plaques, confirming that CD81 sEVs were primarily observed in the neointima, where VSMCs constitute the cellular majority (Fig S4A, S4B, S4C and S4D).

      On a conceptual level, the idea that the small extracellular vesicles contain Type VI Collagen, and this element of their cargo is modulating smooth muscle cell migration, is an intriguing aspect of the authors' working model. Nevertheless, the evidence supporting this potential mechanism does not quite fit together as presented. It is not entirely clear how the collagen VI within the vesicles is somehow accessed by the smooth muscle cell filopodia during migration. Are the vesicles lysed open once on the extracellular matrix? If so, what is the proposed mechanism for that to occur? If not, how are the adhesion molecules on the smooth muscle cell surface engaging the collagen VI fibers that are contained within the vesicles? This aspect of the model does not quite fit together with the proposed mechanism and may be an interesting speculative interpretation, warranting further investigation, but it should not be considered a strong conclusion with sufficient convincing data supporting this idea.

      We thank the reviewer for their insightful comments regarding the mechanism by which collagen VI associated with sEVs could modulate smooth muscle cell adhesion and migration. To clarify, our new data suggest that collagen VI is predominantly present on the surface of the sEVs, as evidenced by Fig 7E. This surface localization strongly implies that collagen VI can be directly accessed by cell surface adhesion receptors, without the need for vesicle lysis or opening. While we cannot entirely rule out all alternative mechanisms, we consider vesicle rupture or lysis within the extracellular matrix to be a highly unlikely route for collagen VI exposure, given the known stability of sEVs under physiological conditions. We have added these points to clarify:

      (1) Results, page 10, Ln 45 “To confirm the presence of collagen VI in ECM-associated sEVs we analysed sEVs extracted from the 3D matrix using 0.5M NaCl treatment and showed that both collagen VI and FN are present (Fig 7D). Next, we analysed the distribution of collagen VI using dot-blot. Alix staining was bright only upon permeabilization of sEV indicating that it is preferentially a luminal protein (Fig 7E). On the contrary, CD63 staining was similar in both conditions showing that it is surface protein (Fig 7E). Interestingly, collagen VI staining revealed that 40% of the protein is located on the outside surface with 60% in the sEV lumen (Fig 7E).”

      (2) Discussion, page 13, Ln 2 “Hence, it will be interesting in future studies to investigate whether sEVs can stimulate Rho activity by presenting adhesion modulators, particularly collagen VI, on their surface, thereby guiding cell directionality during invasion.”

      (3) Discussion, page 14, Ln 30: “In addition to collagen VI the unique adhesion cluster in VSMC-derived sEVs also includes EGF-like repeat and discoidin I-like domain-containing protein (EDIL3), transforming growth factor-beta-induced protein ig-h3 (TGFBI) and the lectin galactoside-binding soluble 3 binding protein (LGALS3BP) and these proteins are also directly implicated in activation of integrin signalling and cellular invasiveness85-87. Although we found that collagen VI plays the key role in sEV-induced early formation of FAs in VSMCs, it is tempting to speculate that the high sEV efficacy in stimulating FA formation is driven by cooperative action of this unique adhesion complex on the sEVs surface and targeting this novel sEV-dependent mechanism of VSMC invasion may open-up new therapeutic opportunities to modulate atherosclerotic plaque development or even to prevent undesired VSMC motility in restenosis.”

      (4) Abstract Figure

      On a technical level, some of the statistical analysis is not readily understood from the data presented. It is very much appreciated that the authors show many of the graphs with technical and biological replicate values in addition to the means and standard deviations (though this is not clearly stated in all figure legends). However, in figures such as Figure 5, there are bars shown and indicated to be different by statistical comparison (see panel B in Figure 5). It is not clear how the values for Group 1 (no FN, no 3-OMS, no sEV) are statistically different (denoted by three asterisks but no p value provided in the legend) than Group 3 (no FN, 3-OMS added, no sEV), when their means and standard deviations appear almost identical. If this is an oversight, this needs to be corrected. If this is truly the outcome, further explanation is warranted. A higher level of transparency in such instances would certainly go a long way in helping address the current crisis of mistrust within the scientific community and at the interface with society at-large.

      We thank the reviewer for their careful reading and important comments on the statistical analysis. We acknowledge that the technical and biological replicate data were not clearly reported in all figure legends and that the statistical approach for Figures 5A and 5B required clarification. In response, we have made several changes for greater transparency and rigor:

      First, we have now explicitly included the numbers of biological replicates (N) and technical replicates (n) in all relevant figure legends for Figures 1–7. In addition, the number of individual cell tracks is now annotated for the migration/invasion analyses, along with the mean values for each dataset.

      Upon review, we found that the original statistical analyses for Figures 5A and 5B were conducted using pooled averaged data. To address this, we have repeated the statistical tests using pooled individual cell track data, applying the Kruskal–Wallis test with Dunn’s multiple comparison correction. This more stringent approach yielded revised p-values, which are now indicated in Figures 5A and 5B.
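
      For readers unfamiliar with this combination of tests, the following is a minimal sketch of a Kruskal–Wallis test followed by Dunn's pairwise comparisons. It is not the analysis pipeline used for Figures 5A and 5B. The function names (`average_ranks`, `kruskal_dunn`), the group labels and the numbers are hypothetical placeholders, and the Bonferroni-style adjustment (raw p-value multiplied by the number of comparisons) is an assumption about how the correction is applied.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        i = j + 1
    return ranks


def kruskal_dunn(groups):
    """Return (H, {(a, b): adjusted p}) for a dict of group name -> values."""
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Mean rank of each group within the pooled ranking.
    mean_rank, start = {}, 0
    for name in names:
        n = len(groups[name])
        mean_rank[name] = sum(ranks[start:start + n]) / n
        start += n

    # Tie correction term: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_sum = sum(t ** 3 - t for t in counts.values())
    tie_factor = 1 - tie_sum / (n_total ** 3 - n_total)
    if tie_factor == 0:
        raise ValueError("all observations are identical")

    # Kruskal-Wallis H statistic (compare to chi-square with k - 1 df).
    h = 12 / (n_total * (n_total + 1)) * sum(
        len(groups[g]) * (mean_rank[g] - (n_total + 1) / 2) ** 2 for g in names
    ) / tie_factor

    # Dunn's z statistic for each pair, with Bonferroni-style adjustment.
    pairs = list(combinations(names, 2))
    base_var = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))
    adjusted = {}
    for a, b in pairs:
        se = math.sqrt(base_var * (1 / len(groups[a]) + 1 / len(groups[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        adjusted[(a, b)] = min(1.0, p * len(pairs))
    return h, adjusted


# Placeholder values only; these are not measurements from the study.
example = {"group_a": [1.0, 2.0, 3.0], "group_b": [4.0, 5.0, 6.0], "group_c": [7.0, 8.0, 9.0]}
h_stat, dunn_p = kruskal_dunn(example)
```

      Because the test works on ranks of individual observations, each cell track contributes one data point, which is the difference from testing pooled averages.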

      With these corrections, we reconfirm our major findings: In the 2D model, fibronectin (FN) coating promotes VSMC velocity, while inhibition of sEV secretion with 3-OMS leads to reduced cell speed (Fig. 5A). Addition of sEVs to the ECM had no effect on VSMC speed at baseline but did rescue cell speed and distance in the presence of 3-OMS, consistent with EVs acting primarily on invasion directionality rather than speed in both 2D and 3D models (Fig. 5A, 5D). Furthermore, sEVs continue to significantly impact VSMC invasion directionality (Figs. 5G, 5H), in agreement with previous reports in tumor cells (Sung et al., 2015).

      In summary, we have implemented the following revisions:

      (1) Figures 5A and 5B: Individual cell track data are now shown, and statistical analyses have been repeated using the Kruskal–Wallis test with Dunn’s multiple comparisons.

      (2) Figure legends and results sections: Numbers of biological and technical replicates, as well as individual data points, are now clearly stated.

      (3) Results, page 9, line 14: The text has been updated to clarify the statistical approach and major findings as described above.

      We hope that these changes address the reviewer’s concerns and improve the transparency and reproducibility of our data presentation.

      Reviewer #1 (Recommendations For The Authors):

      We are very thankful for the comprehensive review and comments which helped to improve our data.

      Figure 1.
      The authors clearly show that FN stimulation (immobilized or cell-derived) promotes sEV secretion via canonical integrin pathways. FN is a promigratory substrate, hence its extensive use as a cell adhesion aid; thus one could assume that simply plating on FN induces a pro-migratory phenotype (later data supports this notion). Does the addition of growth factors also increase sEV release? An endogenous function of FN is siloing of various GFs during clot formation. Also, FAK and SRC networks intersect with canonical RTK signaling in terms of promoting Rac1, CDC42 and other migration mediators. The reason I believe this is important is because the data could be interpreted in two ways: 1) FN induces pro-migration signaling and then sEVs are released, or vice versa, 2) FN induces sEV release and migration is initiated. GF supplementation in the absence of FN would clarify this relationship.

      We thank the reviewer for this insightful comment regarding the possible role of growth factors (GFs) and the mechanistic relationship between FN stimulation, sEV secretion, and cell migration. We agree that FN is a well-established promoter of cell migration, and it is important to distinguish whether FN directly induces a pro-migratory phenotype or does so via sEV-mediated signaling.

      Our data show that FN stimulation markedly increases VSMC motility, as reflected by enhanced cell speed (Fig. 5A), an increased number of focal adhesions (Fig. 6E), and facilitated centripetal movement of FAs (Fig. 6F). Interestingly, ECM-associated sEVs appear to play a complementary but distinct role: they do not significantly affect cell migration speed (Fig. 5A) but instead guide cell invasion directionality (Figs. 5G, 5H), reduce the number of FAs per cell (Fig. 6E), and promote early peripheral FA formation (Fig. 6F). In light of these findings, we have updated our graphical abstract to reflect the unique cross-talk mediated by sEVs between VSMCs and the ECM.

      Regarding the influence of growth factors, we acknowledge that FN can bind and present different GFs, which could also contribute to changes in sEV secretion. Although our inhibition studies and integrin-blocking antibody results support a primary role for β1 integrin activation and actin assembly in triggering sEV secretion, we cannot entirely exclude the possibility that FN-bound growth factors play a role in this process. We have now incorporated this point into the discussion to address the reviewer’s suggestion.

      Discussion, page 14, Ln 7 “Although our small inhibitors and integrin modulating antibody data clearly indicate that β1 activation triggers sEV secretion via activation of actin assembly we cannot fully rule out that FN may also be modulating growth factor activity which in turn contributes to sEV secretion by VSMCs[23]. Excessive collagen and elastin matrix breakdown in atheroma has been tightly linked to acute coronary events hence it will be interesting to study the possible link between sEV secretion and plaque stability as sEV-dependent invasion is also likely to influence the necessary ECM degradation induced by invading cells[96].”

      Figure 2.
      • The authors provide no evidence (or references) that SMIFH2 or CK666 halts filopodia extensions.

      Thank you for this important note. We have included the corresponding references:

      Results, page 5: “So next we tested the contribution of Arp2/3 and formins by using the small molecule inhibitors, CK666 and SMIFH2, respectively31, 32”.  

      • Is there an increase in filopodia density when plated on FN vs plastic? Similarly, if there are more filopodia present is that associated with more sEV? Please provide evidence in this regard.

      We agree that connecting the number of filopodia with the level of sEV secretion could provide an important clue as to whether sEV secretion is driven by FN-induced filopodia formation. However, Myosin-10 staining to quantify filopodia (Fig 2E) showed no difference between VSMCs plated on plastic versus FN matrix. Therefore, we agree with the reviewer that the contribution of filopodia to sEV secretion needs to be investigated further. This idea is reflected in the following changes:

      (1) Results, page 6, Ln 29 “We observed no difference between total filopodia number per cell on plastic or FN matrices (n=18±8 and n=14±3, respectively) however the presence of endogenous CD63+ MVBs along the Myo10-positive filopodia were observed in both conditions (Fig 2E, arrows).”

      (2) Results, page 6, Ln 37 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplemental Video S1).”

      (3) Discussion, page 12, Ln 15: “Focal complexes either disassemble or mature into the elongated centripetally located FAs48. In turn, these mature FAs anchor the ECM to actin stress fibres and the traction force generated by actomyosin-mediated contractility pulls the FAs rearward and the cell body forward12, 13. Here we report that β1 integrin activation triggers sEV release followed by sEV entrapment by the ECM. Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      As hinted above, this data could be interpreted in the light of generally inhibiting cell migration to blunt sEV shedding. Does cell confluence affect sEV release? If cells are cultured to 100% confluency this would limit filopodia formation regardless of ECM type. If sEV secretion remains elevated on FN in this culture condition it would suggest a lack of dependency on filopodia.

      We thank the reviewer for this thoughtful suggestion regarding the influence of cell confluence on sEV release and filopodia formation. To directly address this hypothesis, we performed additional experiments comparing VSMCs cultured at low and high confluency. As described in the revised Results (page 7, line 39), we found that high cellular confluency reduced FN recycling, as indicated by the marked decrease in intracellular FN-positive spots and loss of colocalization with CD63 (Figs S3F, S3G). Importantly, this was accompanied by a significant reduction in CD63+/CD81+ sEV secretion by confluent cells (Fig S3H). These results suggest that VSMC confluence, which suppresses filopodia formation, nearly abolishes both intracellular FN trafficking and sEV secretion, even in the presence of FN. Thus, under our experimental conditions, sEV secretion by VSMCs appears to be closely linked to dynamic cell–matrix interactions and is dramatically reduced when these processes are limited by confluence:

      (1) Results, page 7, Ln 39: “Notably, when we compared FN distribution in sparsely growing VSMCs versus confluent cells we found that FN intracellular spots, as well as colocalization with CD63, completely disappeared in the confluent state (Fig S3F and S3G). This correlated with nearly complete loss of CD63+/CD81+ sEV secretion by the confluent cells indicating that confluence abrogates intracellular FN trafficking as well as sEV secretion by VSMCs (Fig S3H).”

      • Inhibition of branched actin polymerization has been shown to reduce both exocytic and endocytic activity. Thus, it is hard to interpret the results of Fig. 2B as anything more than a generalized effect of losing actin.

      We thank the reviewer for this important point regarding the broad cellular functions of branched actin polymerization, and agree that generalized actin loss can influence both exocytic and endocytic pathways. To address this, we performed additional experiments and analyses to better define the relationship between branched actin structures and sEV-related processes in VSMCs.

      As described in the revised Results (page 6), we overexpressed ARPC2-GFP (an Arp2/3 subunit) together with F-tractin-RFP in VSMCs and carried out live-cell imaging. This approach revealed that Arp2/3 and F-actin organize into lamellipodial scaffolds at the cell cortex, as expected (Fig. S2A; Supplementary Video S2). Additionally, and more unexpectedly, we observed numerous Arp2/3– and F-actin–positive dynamic spots within the VSMC cytoplasm. These structures resemble actin comet tails seen in other systems, previously implicated in endosomal propulsion (Fig. S2A, arrow; Supplementary Video S2).

      Quantitative analysis confirmed that a substantial fraction of these dynamic F-actin/cortactin spots colocalized with CD63+ endosomes (Fig. 2D), and that these structures are indeed branched actin tails based on cortactin immunostaining. Furthermore, inhibition of SMPD3 (with 3-OMS) induced enlarged cortactin/F-actin/CD63+ complexes, morphologically similar to invadopodia (Fig. 2D, arrowheads), supporting a functional link between actin branching and MVB dynamics.

      To quantify the association, we calculated Manders’ colocalization coefficients for F-actin tails and CD63+ endosomal structures in fixed VSMCs, observing that ~50% of F-actin tails were associated with ~13% of endosomes. Upon 3-OMS treatment, this overlap increased further (Fig. S2B).
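
      As an illustration of what these coefficients measure, the following is a minimal sketch of thresholded Manders' coefficients for two intensity channels. It is not the image-analysis workflow used for Fig. S2B: the function name `thresholded_manders`, the fixed thresholds and the pixel values are hypothetical, and the step of extracting round-shaped actin structures is omitted. Here channel 1 stands for F-actin and channel 2 for CD63, so tM1 is the fraction of above-threshold F-actin signal found in CD63-positive pixels and tM2 is the converse.

```python
def thresholded_manders(ch1, ch2, thr1, thr2):
    """Return (tM1, tM2) for two equally sized flat lists of pixel intensities."""
    if len(ch1) != len(ch2):
        raise ValueError("channels must contain the same number of pixels")

    # Discard signal at or below each channel's threshold.
    a = [v if v > thr1 else 0.0 for v in ch1]
    b = [v if v > thr2 else 0.0 for v in ch2]
    sum_a, sum_b = sum(a), sum(b)

    # tM1: share of channel-1 intensity located where channel 2 is present.
    tm1 = sum(x for x, y in zip(a, b) if y > 0) / sum_a if sum_a else 0.0
    # tM2: share of channel-2 intensity located where channel 1 is present.
    tm2 = sum(y for x, y in zip(a, b) if x > 0) / sum_b if sum_b else 0.0
    return tm1, tm2


# Placeholder intensities only; these are not measurements from the study.
actin = [0, 10, 20, 30, 0, 5]
cd63 = [0, 0, 40, 40, 40, 40]
tm1, tm2 = thresholded_manders(actin, cd63, thr1=8, thr2=8)
```

      The two coefficients are not symmetric, which is why a large share of F-actin tails can overlap with CD63 while only a small share of all CD63+ endosomes overlaps with F-actin.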

      Finally, using live-cell imaging (Fig 2C; Supplementary Video S4), we directly observed CD63+ MVBs being propelled through the cytoplasm by Arp2/3-driven actin tails, suggesting a mechanistic role for branched actin assembly in MVB intracellular transport, rather than a generalized effect of actin disruption alone.

      We believe these combined data reinforce a more specific mechanistic role for Arp2/3-mediated branched actin in MVB/endosome transport and, consequently, in sEV secretion in VSMCs—over and above an indirect effect of global actin loss. We hope these additional experiments and quantitative analyses address the reviewer’s concern and clarify the functional relevance of branched actin structures to sEV trafficking:

      (1) Results, page 6, Ln 3 “As regulators of branched actin assembly, the Arp2/3 complex and cortactin are thought to contribute to sEV secretion in tumour cells by mediating MVB intracellular transport and plasma membrane docking[28, 33]. Therefore, we overexpressed the Arp2/3 subunit, ARPC2-GFP and the F-actin marker, F-tractin-RFP in VSMCs and performed live-cell imaging. As expected, Arp2/3 and F-actin bundles formed a distinct lamellipodia scaffold in the cellular cortex (Fig S2A and Supplementary Video S2). Unexpectedly, we also observed numerous  Arp2/3/F-actin positive spots moving  through the VSMC cytoplasm that resembled previously described endosome actin tails observed in Xenopus eggs[33] and parasite infected cells where actin comet tails propel parasites via filopodia to neighbouring cells[34, 35] (Fig S2A, arrow, and Supplementary Video S2). Analysis of the intracellular distribution of Arp2/3 and CD63-positive endosomes in VSMCs showed CD63-MVB propulsion by the F-actin tail in live cells (Fig 2C and Supplementary Video S4).”

      (2) Results, New data Fig 2D, page 6, Ln 14. “we observed numerous F-actin spots in fixed VSMCs that were positive both for F-actin and cortactin indicating that these are branched-actin tails (Fig 2D). Moreover, cortactin/F-actin spots colocalised with CD63+ endosomes and addition of the SMPD3 inhibitor, 3-OMS, induced the appearance of enlarged doughnut-like cortactin/F-actin/CD63 complexes resembling invadopodia-like structures similar to those observed in tumour cells (Fig 2D, arrowheads)[18].”

      (3) Results, New data Fig S2B, page 6, Ln 19 “To quantify CD63 overlap with the actin tail-like structures, we extracted round-shaped actin structures and calculated the thresholded Manders colocalization coefficient (Fig S2B).  We observed overlap between F-actin tails and CD63 as well as close proximity of these markers in fixed VSMCs (Fig S2B). Approximately 50% of the F-actin tails were associated with 13% of all endosomes (tM1=0.44±0.23 and tM2= 0.13±0.06, respectively, N=3). Addition of 3-OMS enhanced this overlap further (tM1=0.75±0.18 and tM2=0.25±0.09) suggesting that Arp2/3-driven branched F-actin tails are involved in CD63+ MVB intracellular transport in VSMCs”

      • In video 1 the author states (lines 8-9; pg6) "intense CD63 staining along filopodia" Although, there is some fluorescence (not strong) in these structures, there was no visible exocytic activity. This data is more suggestive that sEVs (marked by CD63) are not associated with filopodia. The following conclusion statement the authors make is overreaching given this result.

      We thank the reviewer for this careful observation and agree that the previous conclusion regarding sEV release from filopodia was overstated. In response, we have revised both the Results and Discussion sections to more accurately reflect the data.

      (1) Results, page 6, Ln 37 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplemental Video S1).”

      (2) Discussion, page 12, Ln 19 “Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      • Fig 2D and video 2 are wholly unconvincing with regard to sEV secretion sites. The authors could use their CD63-pHluorin construct to count exocytic events in the filopodia vs the whole cell. Given the movie, I have a suspicion this would not be significant. The authors could also perform CD63 staining in non-permeabilized cells to capture and count exocytic events at the plasma membrane as well as their location between groups.

      We thank the reviewer for these constructive suggestions and their critical assessment of our current data regarding the sites of sEV secretion. We agree that our CD63-pHluorin approach clearly indicates sEV secretion events in the soma at the cell–ECM interface, while we did not observe comparable events in filopodia. Accordingly, we have clarified these points in the revised manuscript.

      (1) Results, page 6, Ln 37 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplemental Video S1).”

      (2) Discussion, page 12, Ln 19 “Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      • Fig. 2E and video 4. Again, the conclusions drawn from this data are very strained. First, no co-localization quantification is presented on the proportion of CD63 vesicles with actin. Once again, the movie, if anything convinces the reader that 95-99% of all CD63 vesicles are not associated with actin; therefore, this is an unlikely mechanism of transport.

      We thank the reviewer for this valuable comment and for highlighting the need for quantitative co-localization analysis. In response, we developed a method to systematically quantify F-actin and CD63 co-localization in fixed VSMCs, as now presented in new Figures 2D and S2B. We acknowledge that the majority of CD63+ endosomes are not associated with F-actin, consistent with the reviewer’s interpretation. However, our quantitative data now show that a specific subpopulation of MVBs appears to utilize this actin-based mechanism for transport. We believe this addresses the concern and more accurately reflects the prevalence and significance of the mechanism described.

      (1) Results, page 6, Ln 19 “To quantify CD63 overlap with the actin tail-like structures, we extracted round-shaped actin structures and calculated the thresholded Manders colocalization coefficient (Fig S2B). We observed overlap between F-actin tails and CD63 as well as close proximity of these markers in fixed VSMCs (Fig S2B). Approximately 50% of the F-actin tails were associated with 13% of all endosomes (tM1=0.44±0.23 and tM2=0.13±0.06, respectively, N=3). Addition of 3-OMS enhanced this overlap further (tM1=0.75±0.18 and tM2=0.25±0.09) suggesting that Arp2/3-driven branched F-actin tails are involved in CD63+ MVB intracellular transport in VSMCs.”

      • Are there perturbations that increase filopodia numbers? A gain of function experiment would be valuable here.

      We thank the reviewer for this important suggestion regarding the potential value of gain-of-function experiments to clarify filopodia’s contribution to sEV release. In agreement with the reviewer’s scepticism, we have removed statements linking filopodia to sEV release from both the title and abstract to avoid overinterpretation. At present, our understanding of filopodia biology and the lack of robust tools to selectively and substantially increase filopodia numbers in VSMCs prevent us from directly addressing this question through gain-of-function assays. We acknowledge that future studies using established methods—such as overexpression of filopodia-inducing proteins (e.g., mDia2 or fascin)—could provide insight into whether an increased number of filopodia affects sEV release. However, such experiments are beyond the scope of the current manuscript. We have made the following changes to clarify these points:

      (1) Results, page 6, Ln 37 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplemental Video S1).”

      (2) Discussion, page 12, Ln 19 “Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      Figure 3

      • Fig 3A. The CD63 staining is strongly associated with the entire plasma membrane. How are the authors distinguishing between normal membrane shedding and bona fide sEVs based on this staining alone? This is insufficient, as all membrane structures are seemingly positive. Additionally, there are very few sEVs visible when scrutinizing the provided images. For the "sEV secretion, fold change" graphs in previous figures, could the authors provide absolute values, or an indication of what these values are in absolute terms?

      We thank the reviewer for raising this important point regarding the specificity of CD63 staining and the need to distinguish bona fide sEVs from membrane fragments or general membrane shedding. We agree that CD63 staining alone at the plasma membrane or in the extracellular matrix is not sufficient to unequivocally identify sEVs. To address this, we employed several complementary approaches to rigorously characterize ECM-associated sEVs:

      First, using high-resolution iSIM imaging, we confirmed the association of CD63-positive particles specifically with the FN-rich matrix, and demonstrated that SMPD3 knockdown significantly reduced the number of CD63+ particles in the matrix (Fig. 3B; revised from Fig. S3A).

      Second, by incubating FN matrices with purified and fluorescently labeled sEVs, we directly observed efficient entrapment of these labeled sEVs within the matrices (Fig. 3E), confirming that sEVs can interact with and be retained by the ECM.

      Third, we developed and applied a sequential extraction protocol using mild salt buffer (0.5M NaCl) and strong denaturant (4M guanidine HCl) to selectively extract ECM-associated sEVs based on the strength of their association (see new Figs. S3A and S3B). Extracted vesicles were then characterized by ExoView analysis, which demonstrated a tetraspanin profile (CD63+/CD81+/CD9+) closely matching that of sEVs from conditioned media, providing evidence that these particles are true sEVs and not merely membrane debris. We also found that the more weakly bound (NaCl-extracted) fraction closely resembles media-derived sEVs, while the strongly bound (GuHCl-extracted) fraction is more enriched in CD63+ and CD63+/CD81+ sEVs but contains very few CD9+ vesicles, further supporting distinct extracellular vesicle subpopulations within the ECM.

      In addition, the abundance of CD63+/CD81+ sEVs in both media and ECM-derived fractions was independently validated by CD63 bead-capture assay (Fig. S3B).

      We hope these clarifications and the expanded data set address the reviewer’s concerns about sEV identification and quantification in the extracellular matrix:

      (1) Results, page 7, Ln 16. To quantify ECM-trapped sEVs we applied a modified protocol for the sequential extraction of extracellular proteins using salt buffer (0.5M NaCl) to release sEVs which are loosely-attached to ECM via ionic interactions, followed by 4M guanidine HCl buffer (GuHCl) treatment to solubilize strongly-bound sEVs (Fig S3A) 42. We quantified total sEV and characterised the sEV tetraspanin profile in conditioned media, and the 0.5M NaCl and GuHCl fractions using ExoView. The total particle count showed that EVs are both loosely bound and strongly trapped within the ECM. sEV tetraspanin profiling showed differences between these 3 EV populations. There was close similarity between the conditioned media and the 0.5M NaCl fraction, with high abundance of CD63+/CD81+ sEVs as well as CD63+/CD81+/CD9+ in both fractions (Fig S3A). In contrast, the GuHCl fraction was particularly enriched with CD63+ and CD63+/CD81+ sEVs with very low abundance of CD9+ EVs (Fig S3A). The abundance of CD63+/CD81+ sEVs was confirmed independently by a CD63+ bead capture assay in the media and loosely bound fractions (Fig S3B).

      • A control for Fig 3B would be helpful to parse out random uptake of extracellular debris versus targeted sEV internalization. It would be helpful if the authors added particles of similar size to that of the sEVs to test whether these structures are endocytosed/micropinocytosed at similar levels.

      We thank the reviewer for this useful suggestion regarding the need for better controls to distinguish specific sEV uptake from nonspecific internalization of extracellular debris or similarly sized particles. As a comparison, in our study we analyzed the uptake of both sEVs and serum proteins such as fibronectin and fetuin-A (Figs S3C and S3D), and observed similar patterns of intracellular trafficking. However, we acknowledge that inert nanoparticles or beads of a similar size to sEVs could serve as potential controls to assess nonspecific micropinocytosis or endocytosis.

      It is important to note, however, that the uptake of sEVs is strongly influenced by their surface protein composition and the so-called “protein corona.” Recent work from Prof. Khuloud T. Al-Jamal’s group underscores that exosome uptake mechanisms may be highly specific (Liam-Or et al., 2024), and studies from Mattias Belting’s lab have also shown the importance of heparan sulfate proteoglycans in exosome endocytosis (Cerezo-Magana et al., 2021). As a result, uptake comparisons with inert particles or beads may not fully recapitulate the specificity of sEV internalization, and distinct nanoparticle classes may rely on different uptake pathways.

      Figure 4

      • Fig. 4E,F,G. How are the authors determining the neointima and media compartments without ancillary staining for basement membrane or endothelial markers? Anatomic specific markers need to be incorporated here for the reader to evaluate the specificity of the FN and CD81 staining. It is also hard to understand the severity of the atherosclerotic lesion without a companion H&E cross section.

      We thank the reviewer for highlighting the need for more rigorous characterization of atherosclerotic lesion architecture and anatomical compartments in our study. In response, we have incorporated additional histological analyses and now provide ancillary staining and companion images to enable clear identification of the neointima and medial compartments, as well as to assess lesion severity (see new Figs S4A–S4D):

      (1) Results, page 8, Ln 28. “To test if FN associates with sEV markers in atherosclerosis, we investigated the spatial association of FN with sEV markers using the sEV-specific marker CD81. Staining of atherosclerotic plaques with haematoxylin and eosin revealed well-defined regions with the neointima as well as tunica media layers formed by phenotypically transitioned or contractile VSMCs, respectively (Fig S4A). Masson's trichrome staining of atherosclerotic plaques showed abundant haemorrhages in the neointima, and sporadic haemorrhages in the tunica media (Fig S4B). Staining of atherosclerotic plaques with orcein indicated weak connective tissue staining in the atheroma with a confluent extracellular lipid core, and strong specific staining at the tunica media containing elastic fibres which correlated well with the intact elastin fibrils in the tunica media (Figs S4C and S4D). Using this clear morphological demarcation, we found that FN accumulated both in the neointima and the tunica media where it was significantly colocalised with the sEV marker, CD81 (Fig. 4D, 4E and 4F). Notably CD81 and FN colocalization was particularly prominent in cell-free, matrix-rich plaque regions (Figs. 4E and 4F).”

      • Figs S4C, S4D: proper controls are not provided. Again, a non-FN internalization control as well as a 4 °C cold-block negative control is required to interpret this data.

      We thank the reviewer for this valuable suggestion. To enhance the rigor of our internalization assays, we have now included several additional controls using alternative treatments, fluorophore combinations, and internalization conditions:

      a) We performed FN-Alexa568 uptake assays, followed by immunostaining for CD63 with a distinct fluorophore (Alexa488), to confirm the colocalization of internalized FN with CD63+ endosomal compartments in VSMCs (new Fig. S3E).

      b) We also stained VSMCs, cultured under normal growth conditions, with an anti-FN antibody to visualize intracellular serum-derived FN and again observed colocalization with CD63 (new Figs. S3F and S3G). Notably, in cells grown to confluence, we observed a complete loss of intracellular FN staining and FN/CD63 colocalization, suggesting that FN recycling is prominent in sparse, motile cells, but not in confluent populations.

      These additional controls strengthen our conclusions regarding FN internalization pathways and the conditions under which FN trafficking to the endosomal system occurs:

      (1) Results, page 7, Ln 31. We treated serum-deprived primary human aortic VSMCs with FN-Alexa568 and found that it was endocytosed and subsequently delivered to early and late endosomes together with fetuin A, another abundant serum protein that is a recycled sEV cargo and elevated in plaques (Figs S3C and S3D). CD63 visualisation with a different fluorophore (Alexa488) confirmed FN colocalization with CD63+ MVBs (Fig S3E). Next, we stained non-serum deprived VSMC cultured in normal growth media (RPMI supplemented with 20% FBS) with an anti-FN antibody and observed colocalization of CD63 and serum-derived FN. Co-localisation was reduced likely due to competitive bulk protein uptake by non-deprived cells (Fig S3F). Notably, when we compared FN distribution in sparsely growing VSMCs versus confluent cells we found that FN intracellular spots, as well as colocalization with CD63, completely disappeared in the confluent state (Fig S3F and S3G).

      • Can the authors please provide live and fixed imaging of FN and CD63-mediated filopodial secretion to amply support their conclusions?

      We have observed CD63 MVBs in both fixed (Fig 2E) and live VSMCs (Fig 2F) yet we agree that further studies are required to establish the contribution of filopodia to sEV secretion. Therefore, we have added the following changes:

      (1) Results, page 6, Ln37 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplementary Video S1).”

      (2) Discussion, page 12, Ln19 “Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      Figure 5

      • Fig. 5A,B. The authors claim that sEV supplementation enhances VSMC migration speed and distance. The provided graphs show only a marginal increase in speed with sEV addition (A) but, concerningly, there is a four-star significant difference between the FN condition compared with FN+sEV (B) while the means appear the same. How are these conditions statistically different? The statistics seem off for these comparisons.

      We thank the reviewer for highlighting concerns regarding the statistical analysis in Figures 5A and 5B. In response, we have carefully re-examined our data and statistical approach to ensure accuracy and transparency.

      First, we have now included all individual cell migration tracks in the data representation for these figures. The statistical tests were repeated using the Kruskal–Wallis test with Dunn’s multiple comparison correction across all groups. This more stringent analysis confirmed our key findings: fibronectin (FN) stimulates VSMC migration speed, while inhibition of sEV secretion (with 3-OMS) reduces cellular speed (Fig. 5A). Addition of exogenous ECM-associated sEVs modestly restored cell speed in the presence of 3-OMS, but had no effect on baseline migration speed in 2D or 3D models (Figs. 5A, 5D).

      Regarding the four-star significance observed in the original Fig. 5B, the previous result reflected an analysis based on pooled group averages, which may have overstated marginal differences. The revised analysis, based on individual cell tracks, does not support a substantial difference between FN and FN+sEV groups. The revised p-values and comparisons are now provided directly on the figures and described in the figure legends. We also clearly report the numbers of biological replicates, technical replicates, and individual data points for every condition.
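
      The per-track analysis described above can be sketched as follows. This is a minimal, standard-library-only illustration of a Kruskal–Wallis test followed by Dunn's pairwise comparisons, not the authors' actual pipeline. The Bonferroni adjustment, the function names (`average_ranks`, `chi2_sf`, `kruskal_dunn`), the group labels, and the speed values are all assumptions made for illustration; none of them are data or settings from the study.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution (incomplete-gamma recurrence)."""
    if x <= 0:
        return 1.0
    half = x / 2
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(half)) if df % 2 else math.exp(-half)
    while a < df / 2:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


def kruskal_dunn(groups):
    """groups: dict mapping condition name -> list of per-cell measurements.

    Returns (H, overall p, {(name_a, name_b): Bonferroni-adjusted Dunn p}).
    Assumes at least two distinct values in the pooled data.
    """
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Tie term shared by the H correction and Dunn's variance.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_sum = sum(t ** 3 - t for t in counts.values())

    mean_rank, sizes, start = {}, {}, 0
    for name in names:
        n = len(groups[name])
        mean_rank[name] = sum(ranks[start:start + n]) / n
        sizes[name] = n
        start += n

    h = 12 / (n_total * (n_total + 1)) * sum(
        sizes[g] * mean_rank[g] ** 2 for g in names
    ) - 3 * (n_total + 1)
    h /= 1 - tie_sum / (n_total ** 3 - n_total)
    p_overall = chi2_sf(h, len(names) - 1)

    pairs = list(combinations(names, 2))
    var_base = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))
    dunn = {}
    for a, b in pairs:
        se = math.sqrt(var_base * (1 / sizes[a] + 1 / sizes[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        dunn[(a, b)] = min(1.0, p * len(pairs))  # Bonferroni adjustment
    return h, p_overall, dunn


if __name__ == "__main__":
    # Hypothetical per-cell speeds (arbitrary units), one value per track.
    speeds = {
        "control": [0.21, 0.25, 0.19, 0.30, 0.24, 0.22],
        "FN": [0.35, 0.41, 0.33, 0.38, 0.29, 0.44],
        "FN+inhibitor": [0.23, 0.27, 0.20, 0.31, 0.26, 0.18],
    }
    h, p, pairwise = kruskal_dunn(speeds)
    print(f"H = {h:.3f}, p = {p:.4f}")
    for pair, p_adj in pairwise.items():
        print(pair, f"adjusted p = {p_adj:.4f}")
```

      Because the test ranks every track, each cell contributes one observation. Other multiplicity corrections (for example Holm) could be substituted for Bonferroni without changing the ranking step.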

      Further, the modest effect of ECM-associated sEVs on speed is consistent with our observation that sEVs influence invasion directionality rather than baseline migration velocity, in agreement with previous findings in tumor models (Sung et al., 2015).

      The manuscript has been revised accordingly, with updates in:

      (1) Figures 5A and 5B: Individual cell track data are now shown, and statistical analyses have been repeated using the Kruskal–Wallis test with Dunn’s multiple comparisons.

      (2) Figure legends and results sections: Numbers of biological and technical replicates, as well as individual data points, are now clearly stated.

      (3) Results, page 9, line 14:  “FN as a cargo in sEVs promotes FA formation in tumour cells and increases cell speed14, 15. As we found that FN is loaded into VSMC-derived sEVs we hypothesized that ECM-entrapped sEVs can enhance cell migration by increasing cell adhesion and FA formation in the context of a FN-rich ECM. Therefore, we tested the effect of sEV deposition onto the FN matrix on VSMC migration in 2D and 3D models. We found that FN coating promoted VSMC velocity and inhibition of bulk sEV secretion with 3-OMS reduced VSMC speed in a 2D single-cell migration model (Figs. 5A, 5B) in agreement with previous studies using tumour cells14, 15. However, addition of sEVs to the ECM had no effect on VSMC speed at baseline but rescued cell speed and distance in the presence of the sEV secretion inhibitor, 3-OMS suggesting the EVs are not primarily regulating cell speed (Figs 5A and 5B).”

      (4) Results, page 9, Ln 29 “Hence, ECM-associated sEVs have modest influence on VSMC speed but influence VSMC invasion directionality.”.

      We hope that these changes address the reviewer’s concerns and improve the transparency and reproducibility of our data presentation.

      • Fig. 5d-h. Generally, the magnitude of the difference between the presented conditions is biologically insignificant. Several of the graphs show a four-star difference with means that appear equivalent with overlapping error bars. Do the authors conclude that a 0.1%, or less, effect between groups is biologically meaningful?

      We thank the reviewer for drawing attention to the apparent mismatch between statistical significance and biological relevance in Figures 5d–h. In response, we have reanalyzed the data using individual cell tracks and more stringent non-parametric statistical tests, as described above. This reanalysis confirmed that the magnitude of differences in migration speed and related parameters between the groups is minimal and not biologically meaningful. Thus, we no longer claim that sEVs significantly affect VSMC migration speed under these conditions in either 2D or 3D assays. Our revised manuscript now accurately reflects this finding in both the Results and Discussion sections, and the updated figures and legends clarify the true extent of any differences observed.

      Figure 6

      • Generally, the authors' logic for looking into adhesion, focal adhesion and traction forces is hard to follow. If there are sEV-mediated migration differences, then there would inexorably be focal adhesion alterations. However, the data indicates few differences brought on by sEVs, which speaks to the lack of migration differences presented in Fig. 5. Overall, the sEV migration phenotype has so little of an effect that to then search for a mechanism seems destined not to turn up anything significant.

      We thank the reviewer for highlighting the importance of connecting the observed phenotypic effects of sEVs to the investigation of adhesion and focal adhesion mechanisms. While our revised analysis confirms that sEVs have little to no effect on VSMC migration speed or distance in 2D and 3D models, we did observe a robust effect of sEVs on the directionality of cell invasion (Figs. 5G and 5H). This prompted us to look more closely at pathways involved in cell guidance rather than bulk cell motility.

      Our proteomic comparison between larger EVs (10K fraction) and sEVs (100K fraction) revealed a unique adhesion complex present specifically on the sEVs—comprising collagen VI, TGFBI, LGALS3BP, and EDIL3 (Figs. 7A–C)—each of which has previously been implicated in integrin signaling, cell adhesion, or invasion. Functional blocking and knockdown studies further identified collagen VI as a key mediator in the regulation of cell adhesion and invasion directionality influenced by sEVs (Figs. 7F and 7I).

      In response to this mechanistic insight, we have modified the graphical abstract and discussion to clarify our approach:

      We now explicitly state that our focus has shifted from analyzing baseline migration speed to mechanisms guiding invasion directionality, in line with our key phenotypic findings.

      We highlight that the unique adhesion cluster identified on sEVs, including collagen VI and its cooperative partners, provides a strong mechanistic rationale for examining focal adhesion dynamics and ECM interactions, even in the absence of changes in migration velocity.

      Discussion excerpts (pages 13–14) have been updated to reflect this rationale and to summarize the potential significance of these findings for vascular biology and disease.

      We hope this clarifies the logic underlying our approach and justifies the mechanistic studies performed in this context:

      (1) Discussion, page 13, Ln 2  “Hence, it will be interesting in future studies to investigate whether sEVs can stimulate Rho activity by presenting adhesion modulators—particularly collagen VI—on their surface, thereby guiding cell directionality during invasion.”

      (2) Discussion, page 13, Ln 30 “In addition to collagen VI the unique adhesion cluster in VSMC-derived sEVs also includes EGF-like repeat and discoidin I-like domain-containing protein (EDIL3), transforming growth factor-beta-induced protein ig-h3 (TGFBI) and the lectin galactoside-binding soluble 3 binding protein (LGALS3BP) and these proteins are also directly implicated in activation of integrin signalling and cellular invasiveness85-87. Although we found that collagen VI plays the key role in sEV-induced early formation of FAs in VSMCs, it is tempting to speculate that the high sEV efficacy in stimulating FA formation is driven by cooperative action of this unique adhesion complex on the sEV surface and targeting this novel sEV-dependent mechanism of VSMC invasion may open up new therapeutic opportunities to modulate atherosclerotic plaque development or even to prevent undesired VSMC motility in restenosis.”

      (3) Discussion, page 14, Ln 14 “In summary, cooperative activation of integrin signalling and F-actin cytoskeleton pathways results in the secretion of sEVs which associate with the ECM and play a signalling role by controlling FA formation and cell-ECM crosstalk. Further studies are needed to test these mechanisms across various cell types and ECM matrices.”

      Figure 7

      • The authors need to provide additional evidence Col IV is harbored in sEVs and not a contaminant of sEV isolation as VSMCs secrete a copious amount of this in culture. For instance, IHC of isolated sEVs stained for CD63 and Col IV as well as single cell staining of the same sort.

      We thank the reviewer for this important comment regarding the specificity of collagen VI detection in sEVs. To ensure that collagen VI is associated with bona fide sEVs—rather than being a contaminant resulting from high extracellular abundance—we performed a comparative analysis of vesicles isolated from the same conditioned media. Both proteomic mass spectrometry and western blotting revealed that collagen VI was exclusively present in the small EV (100K pellet) fraction and not in the larger EVs (10K pellet), as shown in Figs. 7B and 7C. Collagen VI was further identified in sEVs extracted from the ECM using our salt/guanidine protocol (new Fig. 7D).

      Reviewer #2 (Recommendations For The Authors):

      The authors have presented a nice collection of data with strong approaches to address their hypotheses. Nevertheless, an additional section within the Discussion would be welcome in addressing the potential limitations and important caveats to be considered alongside their study. These caveats and limitations could be reshaped by additional data supporting the ideas that: (1) small extracellular vesicles can be directly observed during their secretion from filopodia, (2) CD81 labeling in tissue can be interpreted clearly as extracellular vesicles and not the cell surface of other cell types (co-staining with an endothelial cell marker such as PECAM-1 perhaps), and (3) collagen VI within the vesicles is somehow accessed by adhesion molecules on the cell surface of migrating cells.

      We thank the reviewer for these important suggestions and we have now added further studies and modified our conclusions to reflect the data more accurately:

      (1) Results, page 6, Ln37 “We also attempted to visualise sEV release in filopodia using CD63-pHluorin where fluorescence is only observed upon the fusion of MVBs with the plasma membrane39. Using total internal reflection fluorescence microscopy (TIRF) we observed the typical “burst”-like appearance of sEV secretion at the cell-ECM interface in full agreement with an earlier report showing MVB recruitment to invadopodia-like structures in tumor cells18 (Fig S2B and Supplementary Video S1). Although we also observed an intense CD63-pHluorin staining along filopodia-like structures we were not able to detect typical “burst”-like events to confirm sEV secretion in filopodia (Fig S2C and Supplementary Video S1).”

      (2) Discussion, page 12, Ln18: “Here we report that β1 integrin activation triggers sEV release followed by sEV entrapment by the ECM. Curiously we observed CD63+ MVB transport toward the filopodia tips as well as inhibition of sEV-secretion with filopodia formation inhibitors suggesting that sEV secretion can be directly linked to filopodia but further studies are needed to define the contribution of this pathway to the overall sEV secretion by cells.”

      We quantified the colocalization of CD81 and CD31 to exclude the endothelial cell origin of sEVs and extended the characterisation of the atherosclerotic matrix, as well as highlighting limitations to interpretation, i.e. regarding CD81 ECM localisation:

      (1) Results, page 8, Ln 43 “An enhanced expression of CD81 by endothelial cells in early atheroma has been previously reported, so to study the contribution of CD81+ sEVs derived from endothelial cells we investigated the localisation of CD31 and CD8145. In agreement with a previous study, we found that the majority of CD31 colocalises with CD81 (thresholded Manders' split colocalization coefficient 0.54±0.11, N=6) indicating that endothelial cells express CD81 (Fig 4G)45. However, only a minor fraction of total CD81 colocalised with CD31 (thresholded Manders' split colocalization coefficient 0.24±0.06, N=6) confirming that the majority of CD81 in the neointima is originating from the most abundant VSMCs.”

      (2) Results, page 8, Ln 28: “To test if FN associates with sEV markers in atherosclerosis, we investigated the spatial association of FN with sEV markers using the sEV-specific marker CD81. Staining of atherosclerotic plaques with haematoxylin and eosin revealed well-defined regions with the neointima as well as tunica media layers formed by phenotypically transitioned or contractile VSMCs, respectively (Fig S4A). Masson's trichrome staining of atherosclerotic plaques showed abundant haemorrhages in the neointima, and sporadic haemorrhages in the tunica media (Fig S4B). Staining of atherosclerotic plaques with orcein indicated weak connective tissue staining in the atheroma with a confluent extracellular lipid core, and strong specific staining at the tunica media containing elastic fibres which correlated well with the intact elastin fibrils in the tunica media (Figs S4C and S4D). Using this clear morphological demarcation, we found that FN accumulated both in the neointima and the tunica media where it was significantly colocalised with the sEV marker, CD81 (Fig. 4D, 4E and 4F). Notably CD81 and FN colocalization was particularly prominent in cell-free, matrix-rich plaque regions (Figs. 4E and 4F).”
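
      The asymmetry between the two split coefficients quoted for CD31 and CD81 can be illustrated with a small sketch. It assumes one common definition of the thresholded Manders' split coefficient: for each channel, the summed intensity of above-threshold pixels that also exceed the other channel's threshold, divided by that channel's total above-threshold intensity. The function name `manders_split`, the fixed thresholds, and the toy pixel values are hypothetical; imaging software may choose thresholds automatically and may define the denominator slightly differently.

```python
def manders_split(ch_a, ch_b, thr_a, thr_b):
    """Thresholded Manders' split coefficients for two equally sized channels.

    ch_a, ch_b: flat sequences of pixel intensities (same pixel order).
    Returns (M_a, M_b): the fraction of channel A signal found where channel B
    is above threshold, and the fraction of channel B signal found where
    channel A is above threshold.
    """
    if len(ch_a) != len(ch_b):
        raise ValueError("channels must have the same number of pixels")
    total_a = total_b = coloc_a = coloc_b = 0.0
    for a, b in zip(ch_a, ch_b):
        a_on, b_on = a > thr_a, b > thr_b
        if a_on:
            total_a += a
            if b_on:
                coloc_a += a
        if b_on:
            total_b += b
            if a_on:
                coloc_b += b
    m_a = coloc_a / total_a if total_a else float("nan")
    m_b = coloc_b / total_b if total_b else float("nan")
    return m_a, m_b


if __name__ == "__main__":
    # Toy example: a sparse marker (A) that mostly sits on top of an
    # abundant marker (B), while most of B lies outside A.
    sparse = [0, 0, 50, 60, 0, 0, 40, 0]
    abundant = [80, 70, 90, 0, 60, 75, 85, 65]
    m_sparse, m_abundant = manders_split(sparse, abundant, thr_a=10, thr_b=10)
    print(f"M(sparse in abundant) = {m_sparse:.2f}")
    print(f"M(abundant in sparse) = {m_abundant:.2f}")
```

      In this toy case most of the sparse channel overlaps the abundant one, but only a minority of the abundant channel overlaps the sparse one, which is the same kind of asymmetry as the two coefficients reported in the quoted passage.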

      We showed that collagen VI is presented on the surface of sEVs:

      (1) Results, page 10, Ln43: “Collagen VI was the most abundant protein in VSMC-derived sEVs (Fig 7B, Table S7) and was previously implicated in the interaction with the proteoglycan NG253 and suppression of cell spreading on FN54. To confirm the presence of collagen VI in ECM-associated sEVs we analysed sEVs extracted from the 3D matrix using 0.5M NaCl treatment and showed that both collagen VI and FN are present (Fig 7D). Next, we analysed the distribution of collagen VI using dot-blot. Alix staining was bright only upon permeabilization of sEV indicating that it is preferentially a luminal protein (Fig 7E). On the contrary, CD63 staining was similar in both conditions showing that it is a surface protein (Fig 7E). Interestingly, collagen VI staining revealed that 40% of the protein is located on the outside surface with 60% in the sEV lumen (Fig 7E).”

    1. Reviewer #1 (Public review):

      Summary:

      This study examines how two common psychiatric treatments, antidepressant medication and cognitive distancing, influence baseline levels and moment-to-moment changes in happiness, confidence, and engagement during a reinforcement learning task. Combining a probabilistic selection task, trial-by-trial affect ratings, psychiatric questionnaires, and computational modeling, the authors demonstrate that each treatment has distinct effects on affective dynamics. Notably, the results highlight the key role of affective biases in how people with mental health conditions experience and update their feelings over time, and suggest that interventions like cognitive distancing and antidepressant medication may work, at least in part, by shifting these biases.

      Strengths:

      (1) Addresses an important question: how common psychiatric treatments impact affective biases, with potential translational relevance for understanding and improving mental health interventions.

      (2) The introduction is strong, clear, and accessible, making the study approachable for readers less familiar with the underlying literature.

      (3) Utilizes a large sample that is broadly representative of the UK population in terms of age and psychiatric symptom history, enhancing generalizability.

      (4) Employs a theory-driven computational modeling framework that links learning processes with subjective emotional experiences.

      (5) Uses cross-validation to support the robustness and generalizability of model comparisons and findings.

      Weaknesses:

      The authors acknowledge the limitations in the discussion section.

      Additional questions:

      (1) Group Balance & Screening for Medication Use: How many participants in the cognitive distancing and control groups were taking antidepressant medication? Why wasn't medication use included as part of the screening to ensure both groups had a similar number of participants taking medication?

      (2) Assessment of the Practice of Cognitive Distancing: Is there a direct or more objective method to evaluate whether participants actively engaged in cognitive distancing during the task, and to what extent? Currently, the study infers engagement indirectly through the outcomes, but does not include explicit measures of participants' use of the technique. Would including self-report check-ins throughout the task, asking participants whether they were actively engaging in cognitive distancing, have been useful? However, including frequent self-report check-ins would increase procedural differences between groups, perhaps making the tasks less comparable beyond the intended treatment manipulation. Maybe incorporating a question at the end of the task, asking how much they engaged in cognitive distancing, could offer a useful measure of subjective engagement without overly disrupting the task flow.

      Conclusion:

      This study advances our understanding of the mechanisms underlying mental health interventions. The combination of computational modeling with behavioral and affective data offers a powerful framework for understanding how treatments influence affective biases and dynamics. These findings are of broad interest across clinical and mental health sciences, cognitive and affective research, and applied translational fields focused on improving psychological well-being.

    2. Reviewer #3 (Public review):

      Summary:

      The present manuscript investigates and proposes different mechanisms for the effects of two therapeutic approaches - cognitive distancing technique and use of antidepressants - on subjective ratings of happiness, confidence, and task engagement, and on the influence of such subjective experiences on choice behavior. Both approaches were found to link to changes in affective state dynamics in a choice task, specifically reduced drift (cognitive distancing) and increased baseline (antidepressant use). Results also suggest that cognitive distancing may reduce the weighting of recent expected values in the happiness model, while antidepressant use may reduce forgetting of choices and outcomes.

      Strengths:

      This is a timely topic and a significant contribution to ongoing efforts to improve our mechanistic understanding of psychopathology and devise effective novel interventions. The relevance of the manuscript's central question is clear, and the links to previous literature and the broader field of computational psychiatry are well established. The modelling approaches are thoughtful and rigorously tested, with appropriate model checks and persuasive evidence that modelling complements the theoretical argument and empirical findings.

      Weaknesses:

      Some vagueness and lack of clarity in theoretical mechanisms and interpretation of results leave outstanding questions regarding (a) the specific links drawn between affective biases, therapies aimed at mitigating them, and mental health function, and (b) the structure and assumptions of the modelling, and how they support the manuscript's central claims. Broadly, I do not fully understand the distinction between how choice behavior vs. affect are impacted separately or together by cognitive distancing. Clarification on this point is needed, possibly through a more explicit proposal of a mechanism (or several alternative mechanisms?) in the introduction and more explicit interpretation of the modelling results in the context of the cyclical choice-affect mechanism.

      (1) Theoretical framework and proposed mechanisms

      The link between affective biases and negative thinking patterns is a bit unclear. The authors seem to make a causal claim that "affective biases are precipitated and maintained by negative thinking patterns", but it is unclear what precisely these negative patterns are; earlier in the same paragraph, they state that affective biases "cause low mood" and possibly shift choices toward those that maintain low mood. So the directionality of the mechanism here is unclear - possibly explaining a bit more of the cyclic nature of this mechanism, and maybe clarifying what "negative thinking patterns" refer to will be helpful.

      More generally, this link between affect and choices, especially given the modelling results later on, should be clarified further. What is the mechanism by which these two impact each other? How do the models of choice and affect ratings in the RL task test this mechanism? I'm not quite sure the paper answers these questions clearly right now.

      The authors also seem to implicitly make the claim that symptoms of mental ill-health are at least in part related to choice behavior. I find this a persuasive claim generally; however, it is understated and undersupported in the introduction, to the point where a reader may need to rely on significant prior knowledge to understand why mitigating the impact of affective biases on choice behavior would make sense as the target of therapeutic interventions. This is a core tenet of the paper, and it would be beneficial to clarify this earlier on.

      It would be helpful to interpret a bit more clearly the findings from 3.4. on decreased drift in all three subjective assessments in the cognitive distancing group. What is the proposed mechanism for this? The discussion mentions that "attenuated declines [...] over time, [add] to our previously reported findings that this psychotherapeutic technique alters aspects of reward learning" - but this is vague and I do not understand, if an explanation for how this happens is offered, what that explanation is. Given the strong correlation of the drift with fatigue, is the explanation that cognitive distancing mitigates affect drift under fatigue? Or is this merely reporting the result without an interpretation around potential mechanisms?

      (Relatedly, aside from possibly explaining the drift parameter, do the fatigue ratings link with choice behavior in any way? Is it possible that the cognitive distancing was helping participants improve choices under fatigue?)

      (2) Task Structure and Modelling

      It is unclear what counted as a "rewarding" vs. "unrewarding" trial in the model. From my understanding of the task description, participants obtained positive or no reward (no losses), and verbal feedback, Correct/Incorrect. But given the probabilistic nature of the task, it follows that even some correct choices likely had unrewarding results. Was the verbal feedback still "Correct" in those cases, but with no points shown? I did not see any discussion on whether it is the #points earned or the verbal feedback that is considered a reward in the model. I am assuming the former, but based on previous literature, likely both play a role; so it would be interesting - and possibly necessary to strengthen the paper's argument - to see a model that assigns value to positive/negative feedback and earned points separately.

      From a theory perspective, it's interesting that the authors chose to assume separate learning rates for rewarding and non-rewarding trials. Why not, for example, separate reward sensitivity parameters? E.g., rather than a scaling parameter on the PE, a parameter modifying the r term inside the PE equation to, perhaps, assign different values to positive and zero points? (While I think overall the math works out similarly at the fitting time, this type of model should be less flexible on scaling the expected value and more flexible on scaling the actual #points / the subjective experience of the obtained verbal feedback, which seems more in line with the theoretical argument made in the introduction). The introduction explicitly states that negative biases "may cause low mood by making outcomes appear less rewarding" - which in modelling equations seems more likely to translate to different reward-perception biases, and not different learning rates. Alternatively, one might incorporate a perseveration parameter (e.g., similar to Collins et al. 2014) that would also accomplish a negative bias. Either of these two mechanisms seems perhaps worth testing out in a model - especially in a model that defines more clearly what rewarding vs. unrewarding may mean to the participant.
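      The contrast drawn above between separate learning rates and separate reward sensitivities can be sketched as follows. This is a minimal illustration only, assuming a standard delta-rule learner with binary (points / no points) outcomes; the function names and the parameters `alpha_pos`, `alpha_neg`, `rho_pos`, and `rho_zero` are hypothetical and are not taken from the manuscript.

```python
def update_dual_learning_rate(q, r, alpha_pos, alpha_neg):
    """Delta rule with separate learning rates for rewarded and unrewarded trials.

    The prediction error uses the objective outcome r; only its scaling differs.
    """
    pe = r - q
    alpha = alpha_pos if r > 0 else alpha_neg
    return q + alpha * pe


def update_reward_sensitivity(q, r, alpha, rho_pos, rho_zero):
    """Delta rule with one learning rate and a remapped (subjective) outcome.

    rho_pos / rho_zero are the assumed subjective values of positive and zero
    points, so the bias enters inside the prediction error, not outside it.
    """
    subjective_r = rho_pos if r > 0 else rho_zero
    pe = subjective_r - q
    return q + alpha * pe


def simulate(update, outcomes, q0=0.5, **params):
    """Apply an update rule over a sequence of outcomes and return the final Q."""
    q = q0
    for r in outcomes:
        q = update(q, r, **params)
    return q


# Hypothetical usage: same outcome sequence, two different sources of bias.
outcomes = [1, 0, 1, 1, 0, 1, 0, 1]
q_lr = simulate(update_dual_learning_rate, outcomes, alpha_pos=0.1, alpha_neg=0.3)
q_rs = simulate(update_reward_sensitivity, outcomes, alpha=0.2, rho_pos=0.7, rho_zero=0.0)
```

      Under these assumptions, with reward probability p the expected Q-value settles near p·alpha_pos / (p·alpha_pos + (1-p)·alpha_neg) in the first model and near p·rho_pos + (1-p)·rho_zero in the second, so the two sources of bias shift the learned values in different ways even where overall fits are similar.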

      If I understand correctly, the affect ratings models assume that the Q-value and the PE independently impact rating (so they have different weights, w2 and w3), but there is no parameter allowing for different impact for perceived rewarding and unrewarding outcomes? (I may be misreading equations 4-5, but if not, Q-value and PE impact the model via static rather than dynamic parameters.) Given the joint RL-affect fit, this seems to carry the assumption that any perceptual processing differences leading to different subjective perceptions of reward associated with each outcome only impact choice behavior, but not affect? (whereas affect is more broadly impacted, if I'm understanding this correctly, just by the magnitude of the values and PEs?) This is an interesting assumption, and the authors seem to have tested it a bit more in the Supplementary material, as shown in Figure S4. I'm wondering why this was excluded from the main text - it seems like the more flexible model found some potentially interesting differences which may be worth including, especially as they might shed additional insight into the influence of cognitive distancing on the cyclical choice-affect mechanisms proposed.
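      The question about static weights can be illustrated with a minimal sketch. It assumes a common form of affect-rating model in which the rating is a baseline plus a linear drift plus exponentially discounted sums of Q-values and PEs; the exact form of equations 4-5 is not reproduced here, and `w0`, `w1`, `gamma`, `w3_pos`, and `w3_neg` are hypothetical names. The second function shows one way outcome valence could be allowed to affect ratings.

```python
def predicted_ratings(q_values, pes, w0, w1, w2, w3, gamma):
    """Ratings from baseline (w0), drift (w1), and discounted Q-value / PE traces.

    w2 and w3 are static: the same weight applies whatever the outcome was.
    """
    ratings = []
    q_trace = 0.0
    pe_trace = 0.0
    for t, (q, pe) in enumerate(zip(q_values, pes)):
        # gamma < 1 makes older trials count less (forgetting).
        q_trace = gamma * q_trace + q
        pe_trace = gamma * pe_trace + pe
        ratings.append(w0 + w1 * t + w2 * q_trace + w3 * pe_trace)
    return ratings


def predicted_ratings_valenced(q_values, pes, rewarded, w0, w1, w2,
                               w3_pos, w3_neg, gamma):
    """Variant in which PEs from rewarded and unrewarded trials get separate weights."""
    ratings = []
    q_trace = 0.0
    pe_trace = 0.0
    for t, (q, pe, was_rewarded) in enumerate(zip(q_values, pes, rewarded)):
        w3 = w3_pos if was_rewarded else w3_neg
        q_trace = gamma * q_trace + q
        # The valence-specific weight is applied before the PE enters the trace.
        pe_trace = gamma * pe_trace + w3 * pe
        ratings.append(w0 + w1 * t + w2 * q_trace + pe_trace)
    return ratings
```

      When `w3_pos` equals `w3_neg`, the second function reduces to the first, so the simpler model is nested in the more flexible one and the two can be compared directly.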

      Minor comments:

      If fatigue ratings were strongly associated with drift in the best-fitting model (as per page 13), I wonder if it would make sense to use those fatigue ratings as a proxy rather than allow the parameter to vary freely? (This does not in any way detract from the winning model's explanatory power, but if a parameter seems to be strongly explained by a variable we have empirical data for, it's not clear what extra benefit is earned by having that parameter in the model).

    1. Week 14: PEER-TO-PEER WRITING (Thursday, December 4, 2025)

      peer-to-peer writing

      If only we could have that in an autonomous setting, that would be the right counterbalance

      Gives a good target date

    1. In 1984 the main character, Winston Smith, works in the government’s “Ministry of Truth.” His job is to rewrite news stories to be consistent with the way in which the government wants you to think. George Orwell uses his concept of Newspeak, from an earlier essay, which argues that to control what people think, one must control their language, so that only those thoughts consistent with that language will occur. 4

      Newspeak: a way to strip down language and limit it. For example, if no one knew what rebellion meant, then how could they recognize an actual rebellion happening?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) Given the mechanical nature of the device and the propensity for mice to urinate on things, I also wonder how frequently the device breaks/needs to be repaired. Perhaps some details regarding the cost and reliability of the device would be helpful to include, as these are the two things that could make researchers hesitant to adopt immediately.

      We thank the reviewer for their astute observations. We also noted the problem of mouse waste and incorporated this concern into the redesign we mention in the text.

      “Mouse waste getting on mechanical parts was found to be a major concern for the initial version of the device. As part of the redesign, the linear stages were moved out from under the mice to avoid this problem. Despite this problem, the original version of the device has not had any of its stages break down yet. A common problem though was that stimulus tips would blunt or break if they hit the mesh of the mesh table, requiring replacement. This has been solved in the latest version through a new feature where the mesh is detected via the force sensor, prompting immediate stimulus withdrawal, avoiding damage.”

      In regards to cost and adoption, we have added this sentence to the final line of the discussion:

      “To promote wide adoption of this device across as many labs as possible, a company, Tactorum Inc., has been formed.”

      (2) The only major technical concern, which is easy to address, is whether the device generates ultrasonic sounds that rodents can hear when idle or operational, across the ultrasonic frequencies that are of biological relevance (20-110 kHz). These sounds are generally alarm vocalizations and can create stress in animals, and/or serve as cues of an impending stimulus (if indeed they are produced by the device).

      The reviewer brings up an interesting question. The ARM does not make a lot of noise, but some of the noise it emits does fall within the 20-110 kHz range, although it otherwise bears no qualitative resemblance to a mouse vocalization. Based on this, we tested whether the noise produced by the ARM causes stress in naïve mice.

      “A concern was raised that the noise of the ARM may cause stress in the mice tested. To test this, the open field test was performed with naïve mice (n=10) 2 feet from the ARM while the ARM either sat silent or ran through its habituation program, producing noise. The mouse's center point movement was then tracked in relation to the chamber, its edges, and center. No significant differences were found in distance traveled, center entrances, time in center, and latency to center entrance based on a two-tailed Student’s t-test (Figure S1D-G). Based on this, neither stress nor locomotion differences were detected by this test, indicating the ARM does not induce an increased stress state due to its noise, even in non-habituated mice.”

      (3) This sentence in the intro may be inaccurate: "or the recent emergence of a therapeutic targeting voltage-gated sodium channels, that block pain in both rodents and humans such as VX-548 for NaV1.8 (Jones 2023)" Despite extensive searching, I have been unable to find a reference showing that VX-548 is antinociceptive in rodents (rats or mice). As for why this is the case, I do not know. One speculation: this drug may be selective for the human Nav1.8 channel (but again, I have found no references comparing specificity on human vs rodent Nav1.8 channels). To not mislead the field into thinking VX-548 works for rodents and humans, please remove "both rodents and" from the sentence above (unless you find a reference supporting VX-548 as being effective in pain assays with rodents. There is a PK/PD paper with rodents, but that only looks at drug metabolism, not efficacy with pain assays).

      We agree with the reviewer and have removed mention of the new Nav1.8 therapeutic also working in rodents.

      (4) In the intro paragraph where variability in measuring mechanical stimuli is described, there is a new reference from the Stucky lab that further supports the need for an automated way to measure allodynia, as they also found variability between experimenters. This would be a relevant reference to include: Rodriguez Garcia (2024) PMID: 38314814.

      We thank the reviewer for this relevant citation and have updated the text to incorporate it:

      “Recent studies utilizing the manual high-speed analysis of withdrawal behavior developed by Abdus-Saboor et al. 2019 have reproduced this sizable experimenter effect using the new technique (Rodríguez García 2024).”

      (5) "a simple sin wave motion": should be "sine", correct throughout (multiple instances of "sin")

      Corrections made where relevant.

      Reviewer #2 (Public review):

      (1) ARM seems like a fantastic system that could be widely adopted, but no details are given on how a lab could build ARM, thus its usefulness is limited.

      The reviewer raises a good point. Unfortunately, the authors are constrained by university policies around patent law. That said, efforts are being made to make the ARM widely available to interested researchers. As mentioned above in response to Reviewer 1’s comments, we end the discussion section with this sentence:

      “To promote wide adoption of this device across as many labs as possible, a company, Tactorum Inc., has been formed.”

      (2) The ARM system appears to stop short of hitting the desired forces that von Frey filaments are calibrated toward (Figure 2). This may affect the interpretation of results.

      The reviewer makes an important observation. We amended the text to clarify the max forces delivered and to comment on causes beyond the delivery mechanism. It should be noted that a newly purchased set of von Frey filaments was used.

      “With the same 1.4 and 2 g von Frey filaments Researcher 1 delivered max average forces of 1.5 g and 2.7 g, and Researcher 2 1.35 g and 2.4 g. The ARM delivered average max forces closest to the targeted forces, with 1.36 g and 1.9 g. (Figure 2C) Some of the error observed could be due to the error rate (+/- 0.05 g) in the force gauge and the von Frey set used.”

      (3) The authors mention that ARM generates minimal noise; however, if those sounds are paired with stimulus presentation, they could still prompt a withdrawal response. Including some 'catch' trials in an experiment could test for this.

      The reviewer makes a very useful suggestion that we incorporated into our carrageenan experiments. This new data can be found in Supplemental Figure 3F.

      “For the carrageenan model, three replicates of the force ramp stimulus were delivered to each paw, and catch trials were performed every third trial to test whether the mice would respond to the noise of the ARM alone. During catch trials, the stimulus was delivered to the open air behind the mouse, and any movement within 5 seconds of stimulus delivery was counted as a response. These trials found a 96% response rate in true trials, with only a 7% rate in catch trials, indicating responses were not being driven by device noise.”

      (4) The experimental design in Figure 2 is unclear- did each experimenter have their own cohort of 10 mice, or was a single cohort of mice shared? If shared, there's some concern about repeat testing.

      Further clarification was added to avoid confusion on the methods used here.

      “Separate cohorts of 10 mice were used for ARM and manual delivery, with a week given between each researcher to avoid sensitization.”

      (5) In Figure 5 and S4, the order of the legends does not match the order of the graphs. This can be particularly confusing as the color scheme is not colorblind-friendly. Please consider revising the presentation of these figures.

      Corrections made where relevant.

      Reviewer #3 (Public review):

      (1) Limited details are provided for statistical tests and inappropriate claims are cited for individual tests. For example, in Figure 2, differences between researchers at specific forces are reported to be supported by a 2-way ANOVA; these differences should be derived from a post-hoc test that was completed only if the independent variable effects (or interaction effect) were found to be significant in the 2-way ANOVA. In other instances, statistical test details are not provided at all (e.g., Figures 3B, 3C, Figure 4, Figure 6G).

      We would like to thank the reviewer for pointing out the lack of clarity in the text on these statistical methods. We have added further details across the manuscript and show them below in order to address this concern.

      “Both manual delivery and the ARM produced significant paw withdrawal percentage curves, a standard measurement of mechanical sensitivity in the field (von Frey 1896, Dixon 1980, Chaplan 1994)(Figure 2E), with a 2-way ANOVA and a posthoc Tukey test detecting significant increases when comparing the 3 lower force VFHs (0.02g, 0.07g, 0.16g) to the 2 highest force VFHs (1g, 1.4g). This demonstrates that the ARM delivers results comparable to highly experienced researchers. However, a 2-way ANOVA and a posthoc Tukey test found that Researcher 2 elicited a significantly higher (p=0.0008) paw withdrawal frequency than Researcher 1 (Figure S2A) which corresponded with Researcher 2’s higher VFH application time as measured by the force sensor (Figure 2B).”

      “Adjustments were then made to the PAWS software to automate the measurement of withdrawal latency based on pose tracking data of the withdrawal response and the trajectory of the stimulus delivery encoded into the ARM. Testing of C57/BL6J (n=15) at baseline found significant decreases in withdrawal latency for pinprick compared to cotton swab stimuli delivered in identical ways by the ARM (Figure 3B) based on a 2-tailed student t-test.”

      “Mice injected with carrageenan (n=15) showed elevated shaking behavior (p=0.0385, 2-way ANOVA and a posthoc Tukey test) in response to pinprick stimuli in comparison to measurements at baseline (Figure 3C).”

      “Remote habituated mice showed a significant decrease (p=0.0217, 2-way ANOVA) in time to rest over the 3 days (Figure 4B), but no significant differences for any single day. The number of turns was measured for each group during the first 10 minutes of day 1 to act as a baseline, and then from 20 to 30 minutes for each day. Turn counts were then compared as a percentage of the baseline count for each group. This period was chosen as it is the period when experiments start after the day of habituation on experimental days. It was found that remote-habituated mice showed significantly less turning on day 2 compared to mice habituated with a researcher present (p=0.024, 2-way ANOVA posthoc Tukey test), and that only the remote-habituated mice showed significantly decreased turning behavior on day 3 compared to day 1 (p=0.0234, 2-way ANOVA posthoc Tukey test) (Figure 4C).”

      “Sex-dependent differences were found in reflexive and affective behavioral components of the mouse withdrawal response when a researcher was present versus not for both reactions to innocuous and noxious stimuli. A 2-way ANOVA and a posthoc Tukey test found that cotton swab stimuli elicited increased male mouse reflexive paw withdrawal features, including max paw height (p=0.0413) and max paw velocity (Y-axis) (p=0.0424) when Researcher 1 was present compared to when no researcher was present (Figure 4E-F). Pinprick stimuli (Figure 4H-I) on the other hand led to increased max paw height (p=0.0436) and max paw velocity (Y-axis) (p=0.0406) in male mice compared to female mice when Researcher 1 was present.

      Analysis of the shaking behavior elicited by cotton swab and pinprick stimuli found no significant differences in shaking behavior duration (Figure 4SA-B) but found sex-dependent differences in paw distance traveled after the initial withdrawal, including during shaking and guarding behaviors. For cotton swab (Figure 4G) male mice showed significantly increased paw distance traveled compared to female mice when Researcher 2 was present (p=0.0468, 2-way ANOVA posthoc Tukey test) but not when Researcher 1 was present or no researcher was present. Pinprick stimuli also elicited sex-based increases in paw distance traveled (Figure 4J) in male mice when Researcher 2 was present compared to both male mice when no researcher was present (p=0.0149, 2-way ANOVA posthoc Tukey test) and female mice when Researcher 1 was present (p=0.0038, 2-way ANOVA posthoc Tukey test).”

      (2) In the current manuscript, the effects of the experimenter's presence on both habituation time and aspects of the withdrawal reflex are minimal for Researcher 2 and non-existent for Researcher 1. This is surprising given that Researcher 2 is female; the effect of experimenter presence was previously documented for male experimenters as the authors appropriately point out (Sorge et al. PMID: 24776635). In general, this argument could be strengthened (or perhaps negated) if more than N=2 experimenters were included in this assessment.

      The reviewer makes an important point regarding this data and the need for further experiments. We designed a new set of experiments to examine the effect of male and female researchers overall. It should be noted that this is rather noisy data given it was collected by three sets of male and female researchers over 3 weeks. That said a significant difference was found between mouse sexes when a male researcher was present. This is consistent with previous data, but as we discuss this does not invalidate previous data as researcher gender appears to be only one of the factors at work in researcher presence effects on mouse behavior, leading to individuals having the potential for greater or lesser effects than their overall gender. Our new results can be found in Figure 4K.

      “These results indicate that researcher presence at baseline can lead to significant differences in reflexive and affective pain behavior. In this case, male mice showed increased behavioral responses to both touch and pain behavior depending on whether the researcher was present. This led to sex differences in the affective and reflexive component of the withdrawal response when a researcher is present, which disappears when no researcher is present, or a different researcher is present. For this set of researchers, the female researcher elicited the greater behavioral effect. This appeared at first to contradict previous findings (Sorge 2024, Sorge 2014), but it was hypothesized that the effect of an individual researcher could easily vary compared to their larger gender group. To test this, 6 new researchers, half male and half female, were recruited and a new cohort of mice (n=15 male, n=15 female) was tested in the presence of each over the course of 3 weeks, controlling for circadian rhythms (Figure 4K). The newly added force ramp stimulus type was used for these experiments, with three replicates per trial, to efficiently measure mechanical threshold in a manner comparable to previous work. It was found that female mice showed significantly decreased mechanical threshold compared to male mice (p=0.034, Šídák's multiple comparisons test and student’s t-test) when a male researcher was present. This did not occur when a female researcher or no researcher was present. In the latter case a slight trend towards this effect was observed, but it was not significant (p=0.21), and may be the result of a single male researcher being responsible for handling and setting up the mice for all experiments.”

      “These findings indicate that sex-dependent differences in evoked pain behavior can appear and disappear based on which researcher/s are in the room. There is a trend towards male researchers overall having a greater effect, but individuals may have a greater or lesser effect on mouse behavior, independent of the gender or sex. This presents a confound that must be considered in the analysis of sex differences in pain and touch behavior which may explain some of the variation in findings from different researchers. Together, these results suggest that remote stimulus delivery may be the best way to eliminate variation caused by experimenter presence while making it easier to compare with data from researchers in your lab and others.”

      (3) The in vivo BLA calcium imaging data feel out of place in this manuscript. Is the point of Figure 6 to illustrate how the ARM can be coupled to Inscopix (or other external inputs) software? If yes, the following should be addressed: why do the up-regulated and down-regulated cell activities start increasing/decreasing before the "event" (i.e., stimulus application) in Figure 6F? Why are the paw withdrawal latencies and paw distanced travelled values in Figures 6I and 6J respectively so much faster/shorter than those illustrated in Figure 5 where the same approach was used?

      Thanks to the reviewer for bringing up this concern. We have included further text discussing this behavioral data and how it compares to previous work in this study.

      “Paw height and paw velocity were found to be consistent with data from figures 4E-I (male researcher and male mice) and 5C (stimulus intensity 2.5 and 4.5) for similar data, with slightly elevated measures of paw distance traveled and decreased paw withdrawal latency for the pinprick stimulus. This was likely caused by sensitization due to multiple stimulus deliveries over the course of the experiment, as, due to logistical constraints, 30 stimulus trials were delivered per session vs the max of 3 that were performed during previous experiments.”

      “This data indicates that the ARM is an effective tool for efficiently correlating in vivo imaging data with evoked behavioral data, including sub-second behavior. One limitation is that the neural response appears to begin slightly before stimulus impact (Figure 6F, 6SB). This was likely caused by a combination of the imprecise nature of ARM v1 paw contact detection and slight delays in the paw contact signal reaching the Inscopix device due to flaws in the software and hardware used, slowing down the signal. Improvements have been made as part of the ARM v2, which have been shown to eliminate this delay in in vivo fiber photometry data recorded as part of new projects using the device.”

      (4) Another advance of this manuscript is the integration of a 500 fps camera (as opposed to a 2000 fps camera) in the PAWS platform. To convince readers that the use of this more accessible camera yields similar data, a comparison of the results for cotton swabs and pinprick should be completed between the 500 fps and 2000 fps cameras. In other words, repeat Supplementary Figure 3 with the 2000 fps camera and compare those results to the data currently illustrated in this figure.

      The reviewer makes a good point about the need for direct comparison between 500 fps and 2000 fps data. To address this, we added data from the same mice, recorded 2 weeks prior with a comparable setup. These new results can be found in Supplemental Figure 3.

      “Changes were made to PAWS to make it compatible with framerates lower than 2000 fps. This was tested using a 0.4 MP, 522 FPS, Sony IMX287 camera recording at 500 fps, and compared with data recorded at 2000 fps by the previously used Photron FASTCAM (Figure 3SC-F). The camera paired with PAWS was found to be sufficient to separate between cotton swab and pinprick withdrawal responses, suggesting it may be a useful tool for labs that cannot invest in a more expensive device. PAWS features measured from 500 fps video data were not significantly different from the 2000 fps data based on a 2-way ANOVA.”

      (5) In Figure 2F, the authors demonstrate that a von Frey experiment can be completed much faster with the ARM vs. manually. I don't disagree with that fact - the data clearly show this. I do, however, wonder if the framing of this feature is perhaps too positive; many labs wait > 30 s between von Frey filament applications to prevent receptive field sensitization. The fact that an entire set of ten filaments can be applied in < 50 s (< 3 s between filaments given that each filament is applied for 2 s), while impressive, may never be a feature that is used in a real experiment.

      The reviewer makes an important point about how different researchers perform these tests and the relevant timings. We have moderated the framing of these results to address this concern.

      “Further, we found that the ARM decreased the time needed to apply a stimulus 10 times to a mouse paw by 50.9% compared to manual delivery (Figure 2F). This effect size may decrease for researchers who leave longer delays between stimulus delivery, but the device should still speed up experiments by reducing aiming time and allowing researchers to quickly switch to a new mouse while waiting for the first.”

      (6) Why are different affective aspects of the hindpaw withdrawal shown in different figures? For example, the number of paw shakes is shown in Figure 3C, whereas paw shaking duration is shown in Figure 5D. It would be helpful - and strengthen the argument for either of these measures as being a reproducible, reliable measure of pain - if the same measure was used throughout.

      Thanks to the reviewer for pointing out this discrepancy. We have adjusted the figures and text to only use the Number of Paw Shakes for better consistency (Figure 5D and Figure 5-figure supplement 1C).

      (7) Is the distance the paw traveled an effective feature of the paw withdrawal (Figure 5E)? Please provide a reference that supports this statement.

      A relevant citation and discussion of this metric based on previous studies has been added.

      “Mice injected with carrageenan (n=15) showed elevated shaking behavior (p=0.0385) in response to pinprick stimuli in comparison to measurements at baseline (Figure 3C). This aligned with previous findings where PAWS has detected elevations in shaking and/or guarding behavior, examples of affective pain behavior, and post-peak paw distance traveled, which correlates with these behaviors in carrageenan pain models and has been found to be a good measure of them in past studies (Bohic et al. 2023).”

      (8) Dedek et al. (PMID: 37992707) recently developed a similar robot that can also be used to deliver mechanical stimuli. The authors acknowledge this device's ability to deliver optogenetic and thermal stimuli but fail to mention that this device can deliver mechanical stimuli in a similar manner to the device described in this paper, even without experimenter targeting. Additional discussion of the Dedek et al. device is warranted.

      We would like to thank the reviewer for identifying this omission. Discussion of this as well as further discussion of Dedek et al.’s automation prototyping work has been added.

      “Previous attempts at automating mechanical stimulus delivery, including the electronic von Frey (Martinov 2013) and dynamic plantar aesthesiometer (Nirogi 2012), have focused on eliminating variability in stimulus delivery. In contrast to the ARM, both of these devices rely upon a researcher being present to aim or deliver the stimulus, can only deliver vFH-like touch stimuli, and only measure withdrawal latency/force threshold. Additionally, progress has been made in automating stimulus assays by creating devices with the goal of delivering precise optogenetic and thermal stimuli to the mouse’s hind paw (Dedek 2023, Schorscher-Petchu 2021). The Prescott team went farther and incorporated a component into their design to allow for mechanical stimulation but this piece appears to be limited to a single filament type that can only deliver a force ramp. As a result, these devices and those previously discussed lack the customization for delivering distinct modalities of mechanosensation that the ARM allows for. Moreover, in its current form the automated aiming of some of these devices may not provide the same resolution or reliability as the ARM in targeting defined targets (Figure 1C), such as regions of the mouse paw that might be sensitized during chronic pain states. Due to the nature of machine learning pose estimation, substantial work, beyond the capacity of a single academic lab, in standardizing the mouse environment and building a robust model based on an extensive and diverse training data set will be necessary for automated aiming to match the reliability or flexibility of manual aiming. That said, we believe this work along with that of the other groups mentioned has set the groundwork from which a new standard for evoked somatosensory behavior experiments in rodents will be built.”

      (9) Page 2: von Frey's reference year should be 1896, not 1986.

      This typo has been fixed; we thank the reviewer for noting it.

      “For more than 50 years, these stimuli have primarily been the von Frey hair (vFH) filaments that are delivered to the mouse paw from an experimenter below the rodent aiming, poking, and subsequently recording a paw lift (von Frey 1896, Dixon 1980, Chaplan 1994).”

      (10) Page 2: Zumbusch et al. 2024 also demonstrated that experimenter identification can impact mechanical thresholds, not just thermal thresholds.

      Text has been updated in order to note this important point.

      “A meta-analysis of thermal and mechanical sensitivity testing (Chesler 2002, Zumbusch 2024) found that the experimenter has a greater effect on results than the mouse genotype, making data from different individual experimenters difficult to merge.”

      (11) Page 2: One does not "deliver pain in the periphery". Noxious stimuli or injury can be delivered to the periphery, but by definition, pain is a sensation that requires a central nervous system.

      Text has been updated for improved accuracy.

      “Combining approaches to deliver painful stimuli with techniques mapping behavior and brain activity could provide important insights into brain-body connectivity that drives the sensory encoding of pain.”

    1. property

      **Jurisprudência em Teses - Edition No. 124 - PUBLIC PROPERTY**

      • 1) Assets forming part of the property holdings of mixed-capital companies (sociedades de economia mista) that are subject to a public purpose <u>are treated as equivalent</u> to public property and therefore cannot be acquired by adverse possession (usucapião).

      • 2) Real property administered by the Companhia Imobiliária de Brasília - Terracap is public and therefore cannot be acquired by adverse possession.

      • 3) Real property tied to the Sistema Financeiro de Habitação - SFH (Housing Finance System), because it is dedicated to the provision of a public service, <u>must be</u> treated as public property and therefore cannot be the object of adverse possession.

      • 4) Adverse possession of the useful domain (domínio útil) of public property over which an emphyteusis had previously been established may be recognized, since in that circumstance the adverse possessor merely replaces the emphyteuta, with no harm to the State.

      • 5) The Federal Government (União) may not unilaterally modify the value of the full ownership (domínio pleno) of property held under emphyteusis; only monetary correction applies to the annual update of the ground rent (foro) payable in the emphyteusis of its property (art. 101 of Decree-Law No. 9760/1946).

      • 6) Grants of vacant public lands (terras devolutas) located in the border strip, made by the States, authorize <u>only</u> use; ownership remains with the Federal Government, even if it remains inert or tolerant toward the possessors. (Súmula n. 477/STF)

      • 7) Lands in border strips and lands without a real-estate registration are not, by that fact alone, terras devolutas; the federative entity bears the burden of proving title to those lands.

      • 8) Failure to fulfill a condition (encargo) established in the law that ordered the donation of public property is, <u>by itself</u>, grounds for undoing the donation.

      • 9) Unlawful occupation of public property constitutes mere detention, of a precarious nature, which gives no right of retention or compensation for accessions and improvements. (Súmula n. 619/STJ)

      • 10) Irregular construction or activity on property for common use by the people reveals <u>presumed harm</u> to the community; proof of concrete damage is not required.

      • 11) Private property registrations of real estate located on marine lands (terrenos de marinha) are not enforceable against the Federal Government. (Súmula n. 496/STJ) (Thesis decided under the procedure of art. 543-C of the CPC/73 - TEMA 419)

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review): 

      Summary: 

      Ferreiro et al. present a method to simulate protein sequence evolution under a birth-death model where sequence evolution is guided by structural constraints on protein stability. The authors then use this model to explore the predictability of sequence evolution in several viral proteins. In principle, this work is of great interest to molecular evolution and phylodynamics, which has struggled to couple non-neutral models of sequence evolution to phylodynamic models like birth-death processes. Unfortunately, though, the model shows little improvement over neutral models in predicting protein sequence evolution, although it can predict protein stability better than models assuming neutral evolution. It appears that more work is needed to determine exactly what aspects of protein sequence evolution are predictable under such non-neutral phylogenetic models. 

      We thank the reviewer for the positive comments about our work. We agree that further work is needed in the field of substitution models of molecular evolution to enable more accurate predictions of specific amino acid sequences in evolutionary processes.

      Major concerns: 

      (1) The authors have clarified the mapping between birth-death model parameters and fitness, but how fitness is modeled still appears somewhat problematic. The authors assume the death rate = 1 - birth rate. So a variant with a birth rate b = 1 would have a death rate d = 0 and so would be immortal and never die, which does not seem plausible. Also I'm not sure that this would "allow a constant global (birth-death) rate" as stated in line 172, as selection would still act to increase the population mean growth rate r = b - d. It seems more reasonable to assume that protein stability affects only either the birth or death rate and assume the other rate is constant, as in the Neher 2014 model. 

      We implemented the model proposed by Neher, et al. (2014), in which every variant has a death rate (d) greater than 0, and applied it in the present method. In general, its results did not differ from those obtained with the model that assumes d = 1 – b, suggesting that this aspect may not be crucial for the study system. Moreover, imposing death events according to an arbitrary death rate could itself be a point of concern. Regarding the original model, a variant with d = 0 can still lose fitness through the mutation process: in an evolutionary process each variant is subject to mutation, and Markov models allow mutations that decrease fitness (with lower probability than beneficial ones, but they can still occur). All this information is included in the manuscript.
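The two parameterizations discussed in this exchange can be compared numerically. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the second function only encodes the description given in these responses (death rate fixed at 1, birth rate equal to 1 + fitness). It shows that d = 1 - b fixes the total event rate b + d at 1 while the growth rate r = b - d still varies with b.

```python
def rates_complementary(birth):
    """Hypothetical helper: death rate tied to birth rate by d = 1 - b."""
    if not 0.0 <= birth <= 1.0:
        raise ValueError("birth rate must lie in [0, 1] when d = 1 - b")
    return birth, 1.0 - birth


def rates_constant_death(fitness, death=1.0):
    """Hypothetical helper: constant death rate, birth rate = 1 + fitness."""
    return 1.0 + fitness, death


def summarize(birth, death):
    """Return the total event rate (b + d) and the growth rate (b - d)."""
    return {"total_rate": birth + death, "growth_rate": birth - death}


if __name__ == "__main__":
    # Under d = 1 - b the total rate is constant but r = 2b - 1 varies.
    for b in (0.6, 0.9, 1.0):
        print("d = 1 - b:", b, summarize(*rates_complementary(b)))
    # Under the constant-death form both the total rate and r vary with fitness.
    for f in (0.0, 0.2, 0.5):
        print("d = 1:", f, summarize(*rates_constant_death(f)))
```

With b = 1 the first form gives d = 0, which is the "immortal variant" case raised by the reviewer; the second form keeps d > 0 for every variant.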

      (2) It is difficult to evaluate the predictive performance of protein sequence evolution. This is in part due to the fact that performance is compared in terms of percent divergence, which is difficult to compare across viral proteins and datasets. Some protein sequences would be expected to diverge more because they are evolving over longer time scales, under higher substitution rates or under weaker purifying selection. It might therefore help to normalize the divergence between predicted and observed sequences by the expected or empirically observed amount of divergence seen over the timescale of prediction. 

      The protein datasets studied showed different levels of sequence divergence over their evolutionary times, as indicated for each dataset in the manuscript. For some metrics, we evaluated the accuracy (or error) of the predictions through direct comparisons between real and predicted protein variants, using percentages to facilitate interpretation: 0% indicates a perfect prediction (no error), while 100% indicates a completely incorrect prediction (total error). Regarding normalization of these evaluations, we respectfully disagree with the suggestion because diverse factors can affect the observed divergence (not only the substitution rate, but also the sample size and structural features of the protein that may affect stability when accommodating different sequences, among others), and this complicates defining a consistent and meaningful normalization criterion. Given that the manuscript provides detailed information for each dataset, we believe that presenting prediction accuracy through direct comparisons between real and predicted protein variants, expressed as percentages, is the clearest way.
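The percentage scale described in this response (0% for a perfect prediction, 100% for a completely incorrect one) can be illustrated with a simple per-site comparison of two aligned sequences. This is a hedged sketch only: the function name is hypothetical, gap handling is an assumption, and the manuscript's metrics may be defined differently.

```python
def percent_divergence(real_seq, predicted_seq, gap="-"):
    """Percentage of compared sites at which two aligned sequences differ.

    Assumption: sites where either sequence has a gap are skipped.
    """
    if len(real_seq) != len(predicted_seq):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (a, b) for a, b in zip(real_seq, predicted_seq) if a != gap and b != gap
    ]
    if not compared:
        raise ValueError("no ungapped sites to compare")
    mismatches = sum(1 for a, b in compared if a != b)
    return 100.0 * mismatches / len(compared)


if __name__ == "__main__":
    # Toy sequences, not data from the study.
    print(percent_divergence("MKTAYIAK", "MKTAYIAK"))
    print(percent_divergence("MKTAYIAK", "MKSAYLAK"))
```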

      (3) Predictability may also vary significantly across different sites in a protein. For example, mutations at many sites may have little impact on structural stability (in which case we would expect poor predictive performance) while even conservative changes at other sites may disrupt folding. I therefore feel that there remains much work to be done here in terms of figuring out where and when sequence evolution might be predictable under these types of models, and when sequence evolution might just be fundamentally unpredictable due to the high entropy of sequence space. 

      We agree with this reflection. Mutations can have different effects on folding stability, which are accounted for by the model presented in this study. However, accurately predicting the exact sequences of protein variants with similar stability remains difficult with current structurally constrained substitution models, and therefore, further work is needed in this regard. This aspect is indicated in the manuscript.

      We want to thank the reviewer again for taking the time to revise our work and for the insightful and helpful comments.

      Reviewer #2 (Public review): 

      In this study, the authors aim to forecast the evolution of viral proteins by simulating sequence changes under a constraint of folding stability. The central idea is that proteins must retain a certain level of structural stability (quantified by folding free energy, ΔG) to remain functional, and that this constraint can shape and restrict the space of viable evolutionary trajectories. The authors integrate a birth-death population model with a structurally constrained substitution (SCS) model and apply this simulation framework to several viral proteins from HIV-1, SARS-CoV-2, and influenza.

      The motivation to incorporate biophysical constraints into evolutionary models is scientifically sound, and the general approach aligns with a growing interest in bridging molecular evolution and structural biology. The authors focus on proteins where immune pressure is limited and stability is likely to be a dominant constraint, which is conceptually appropriate. The method generates sequence variants that preserve folding stability, suggesting that stability-based filtering may capture certain evolutionary patterns. 

      Correct. We thank the reviewer for the positive comments about our study.

      However, the study does not substantiate its central claim of forecasting. The model does not predict future sequences with measurable accuracy, nor does it reproduce observed evolutionary paths. Validation is limited to endpoint comparisons in a few datasets. While KL divergence is used to compare amino acid distributions, this analysis is only applied to a single protein (HIV-1 MA), and there is no assessment of mutation-level predictive accuracy or quantification of how well simulated sequences recapitulate real evolutionary paths. No comparison is made to real intermediate variants available from extensive viral sequencing datasets which gather thousands of sequences with detailed collection date annotation (SARS-CoV-2, Influenza, RSV). 

      There are several points in this comment.

      The presented method accurately predicts folding stability of forecasted variants, as shown through comparisons between real and predicted protein variants. However, as the reviewer correctly indicates, predicting the exact amino acid sequences remains challenging. This limitation is discussed in detail in the manuscript, where we also suggest that further improvements in substitution models of protein evolution are needed to better capture the evolutionary signatures of amino acid change at the sequence level, even between amino acids with similar physicochemical properties. Regarding the time points used for validation, the studied influenza NS1 dataset included two validation points. A key limitation in increasing the number of time points is the scarcity of datasets derived from monitoring protein evolution with sufficient molecular diversity between samples collected at consecutive time points (i.e., at least more than five polymorphic amino acid sites). 

      As described in the manuscript, calculating Kullback-Leibler (KL) divergence requires more than one sequence per studied time point. However, most datasets in the literature include only a single sequence per time point, typically a consensus sequence derived from bulk population sequencing. Generating multiple sequences per time point is experimentally more demanding, often requiring advanced methods such as single-virus sequencing or amplification of sublineages in viral subpopulations, as was done for the first dataset used in the study (Arenas, et al. 2016), which enabled the calculation of KL divergence. The extent to which the simulated sequences resemble real evolution is evaluated in the method validation. As noted, intermediate time point validation was performed using the influenza NS1 protein dataset. Although, as the reviewer indicates, thousands of viral sequences are available, these are usually consensus sequences from bulk sequencing. Indeed, many viral variants mainly differ through synonymous mutations, where the number of accumulated nonsynonymous mutations is small. For example, from the original Wuhan strain to the Omicron variant, the SARS-CoV-2 proteins Mpro and PLpro accumulated only 10 and 22 amino acid changes, respectively.
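The requirement of several sequences per time point follows from how the KL divergence is computed: it compares two frequency distributions, and a single consensus sequence gives only a degenerate distribution at each site. The sketch below assumes a per-site comparison averaged over the alignment, with a pseudocount to avoid zero frequencies; both choices, and all names, are assumptions rather than the procedure used in the manuscript.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_frequencies(column, pseudocount=1.0):
    """Smoothed amino acid frequencies for one alignment column."""
    counts = Counter(column)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the 20 amino acids."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in AMINO_ACIDS if p[a] > 0)


def mean_site_kl(observed, simulated, pseudocount=1.0):
    """Average per-site KL divergence between two sets of aligned sequences."""
    length = len(observed[0])
    if any(len(s) != length for s in observed + simulated):
        raise ValueError("all sequences must share the same aligned length")
    total = 0.0
    for i in range(length):
        p = site_frequencies([s[i] for s in observed], pseudocount)
        q = site_frequencies([s[i] for s in simulated], pseudocount)
        total += kl_divergence(p, q)
    return total / length


if __name__ == "__main__":
    # Toy alignments, not data from the study.
    observed = ["ACDK", "ACDK", "ACEK", "GCDK"]
    simulated = ["ACDK", "ACDR", "ACEK", "ACDK"]
    print(mean_site_kl(observed, simulated))
```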

      Analyzing intermediate variants of concern (e.g., Gamma or Delta) would reduce this number further, weakening the statistics. In addition, many available viral sequences are not consecutive in evolutionary terms (one dataset does not represent the direct origin of another dataset at a subsequent time point), which further limits their applicability in this study. There is little data from monitored protein evolution with consecutive samples. The most suitable studies usually involve in vitro virus evolution, but the data from these studies often show low genetic variability between samples collected at different time points. Finally, it is important to note that the presented method can only be applied to proteins with known 3D structures, as it relies on selection based on folding stability. Non-structural proteins cannot be analyzed using this approach. Future work could incorporate additional selection constraints, which may improve the accuracy of predictions. These considerations and limitations are indicated in the manuscript.

      The selection of proteins is narrow and the rationale for including or excluding specific proteins is not clearly justified. 

      The viral proteins included in the study were selected based on two main criteria, general interest and data availability. In particular, we included proteins from viruses that affect humans and for which data from monitored protein evolution, with sufficient molecular diversity between consecutive time points, is available. These aspects are indicated in the manuscript.

      The analyzed datasets are also under-characterized: we are not given insight into how variable the sequences are or how surprising the simulated sequences might be relative to natural diversity. Furthermore, the use of consensus sequences to represent timepoints is problematic, particularly in the context of viral evolution, where divergent subclades often coexist - a consensus sequence may not accurately reflect the underlying population structure. 

      The manuscript indicates the sequence identity among protein datasets of different time points, along with other technical details. The evaluation based on comparisons between simulated and real sequences reflects how surprising the simulated sequences might be relative to natural diversity, provided that the real dataset is representative. We believe that the diverse real datasets studied are useful to evaluate the accuracy of the method in predicting different molecular patterns. Regarding the use of consensus sequences, we agree that they provide an approximation. However, as previously indicated, most of the available data from monitored protein evolution consist of consensus sequences obtained through bulk sequencing. Additionally, analyzing every individual viral sequence within a viral population, which is typically large, would be ideal but computationally intractable.

      The fitness function used in the main simulations is based on absolute ΔG and rewards increased stability without testing whether real evolutionary trajectories tend to maintain, increase, or reduce folding stability over time for the particular systems (proteins) that are studied. While a variant of the model does attempt to center selection around empirical ΔG values, this more biologically plausible version is underutilized and not well validated.

      The applied fitness function, based on absolute ΔG, is well established in the field (Sella and Hirsh 2005; Goldstein 2013). The present study independently predicts ΔG for the real and simulated protein variants at each sampling point. This ΔG prediction accounts not only for negative design, informed by empirical data, but also for positive design based on the study data (Arenas, et al. 2013; Minning, et al. 2013), thereby enabling the detection of variation in folding stability among protein variants. These aspects are indicated in the manuscript. Therefore, in our view, the study provides a proper comparison of real and predicted evolutionary trajectories in terms of folding stability.
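To make "fitness based on absolute ΔG" concrete, one form commonly used in the stability-based literature takes fitness as the fraction of protein molecules in the folded state under a two-state model, f = 1 / (1 + exp(ΔG / kT)), with more negative ΔG meaning a more stable protein. Whether the manuscript uses exactly this expression is an assumption here; the function name and the value of kT (about 0.593 kcal/mol near 25 °C) are illustrative.

```python
import math

KT_KCAL_PER_MOL = 0.593  # approximate k_B * T near 25 degrees C (assumed)


def folded_fraction_fitness(delta_g, kt=KT_KCAL_PER_MOL):
    """Two-state folded fraction used as a fitness proxy (illustrative).

    delta_g: folding free energy in kcal/mol; negative values are stable.
    """
    x = delta_g / kt
    # Use the algebraically equivalent form that avoids overflow in exp().
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    # Fitness rises monotonically as delta_g becomes more negative.
    for dg in (2.0, 0.0, -2.0, -5.0, -10.0):
        print(dg, folded_fraction_fitness(dg))
```

Because this function increases monotonically with stability, it rewards ever more negative ΔG, which is the property the reviewer questions; a variant that centers selection on an empirical ΔG value would instead peak at that target.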

      Ultimately, the model constrains sequence evolution to stability-compatible trajectories but does not forecast which of these trajectories are likely to occur. It is better understood as a filter of biophysically plausible outcomes than as a predictive tool. The distinction between constraint-based plausibility and sequence-level forecasting should be made clearer. Despite these limitations, the work may be of interest to researchers developing simulation frameworks or exploring the role of protein stability in viral evolution, and it raises interesting questions about how biophysical constraints shape sequence space over time. 

      The presented method estimates the fitness of each protein variant, which can reflect the relative survival capacity of the variant. Therefore, despite the error due to evolutionary constraints not considered by the method, it indicates which variants are more likely to become fixed over time. In our view, the method does not merely filter plausible variants; rather, it generates predictions of variant survival through predicted fitness based on folding stability and simulations of protein evolution under structurally constrained substitution models integrated with birth-death population genetics approaches. The use of simulation-based approaches for prediction is well established in population genetics. For example, approaches such as approximate Bayesian computation (Beaumont, et al. 2002) rely on this strategy, and it has also been applied in other studies of forecasting evolution (e.g., Neher, et al. 2014). We believe that the distinction between forecasting folding stability and amino acid sequence is clearly shown in the manuscript, including the main text and the figures.

      Reviewer #2 (Recommendations for the authors): 

      I thank the authors for addressing the question about template switching, their clarification was helpful. However, the core concerns I raised remain unresolved: the claim that the method is useful for forecasting is not substantiated.  In order to support the paper's central claims or to prove its usefulness, several key improvements could be incorporated: 

      (1) Systematic analysis of more proteins: 

      The manuscript would be significantly strengthened by a systematic evaluation of model performance across a broader set of viral proteins, beyond the examples currently shown. Many human influenza and SARS-CoV-2 proteins have well-characterized structures or high-quality homology templates, making them suitable candidates. In the light of limited success of the method, presenting the model's behavior across a more comprehensive protein set, including those with varying structural constraints and immune pressures, would help assess generalizability and clarify the specific conditions under which the model is applicable. 

      Following a comment from the reviewer in a previous revision of the study, we included the analysis of an influenza NS1 protein dataset that contains two evaluation time points. To validate the prediction method, it is necessary to have monitored protein sequences collected at two or more consecutive time points, with sufficient divergence between them to capture evolutionary signatures that allow for proper evaluation. Additionally, many data involve sequences that are not consecutive in evolutionary terms (one dataset is not a direct ancestor of another dataset existing at a later time point), which precludes their use in this study. Little data from monitored protein evolution with reliable consecutive (ancestor-descendant) samples exist. The most suitable studies often involve in vitro virus evolution, but they usually show low genetic variability between samples collected at different time points. Although thousands of sequences are available for some viruses, they are usually consensus sequences from bulk sequencing and often show a low number of nonsynonymous mutations at the study protein-coding gene between time points. For example, from the original Wuhan strain to the Omicron variant, the SARS-CoV-2 proteins Mpro and PLpro accumulated only 10 and 22 amino acid changes, respectively. Analyzing intermediate variants of concern (e.g., Gamma or Delta) would reduce this number further, weakening the statistics. Thus, in practice, we found a scarcity of data derived from monitoring protein evolution, with reliable ancestor and corresponding descendant data at consecutive time points and with sufficient molecular diversity between them (i.e., more than five polymorphic amino acid sites). 
      In all, we believe that the diverse viral protein datasets used in the present study, along with the multiple analyzed datasets collected from monitored HIV-1 populations present in different patients, provide a representative application of the method, since similar patterns were generally obtained from the analysis of the different datasets.

      (2) Present clear data statistics: For each analyzed dataset, the authors should provide basic information about the number of unique sequences, levels of variability, and evolutionary divergence between start and end sequences. This would contextualize the forecasting task and clarify whether the simulations are non-trivial. In particular, it should be shown that the consensus sequence is indeed representative of the viral population at a given time point. In viral evolution we frequently observe co-circulation of subclades and the consensus sequence is then not representative. 

      For each dataset analyzed, the manuscript provides the sequence identity between samples at the study time points (which also informs about sequence variability), sample sizes, representative protein structure, and other technical details. The study assumes that consensus sequences, typically generated by bulk sequencing, are representative of the viral population. Next, samples at different time points should involve ancestor-descendant relationships, which is a requirement and one of the limitations to find appropriate data for this study, as noted in our previous response.

      (3) Explore other metrics for population level sequence comparison: 

      In the light of the possible existence of subclades, mentioned above, the currently used metrics for sequence comparison may underestimate performance of the simulations. It would be sufficient to see some overlap of the simulated clades and the observed clades. 

      We found this to be a good idea. However, in practice, we believe that the criteria used to define subclades could introduce biases into the results. For some metrics, we evaluated the accuracy of the predictions through direct comparisons between all real and predicted protein variants, using percentages to facilitate interpretation. We believe that using subclades could potentially reduce the current prediction errors, but this would complicate the interpretation of the results, as they would be influenced by the subjective criteria used to define the subclades.

      Currently, the manuscript presents a plausible filtering framework rather than a predictive model. Without these additional analyses, the main claims remain only partially supported. 

      Please see our reply to the comment of the reviewer just before the section titled “Recommendations for the authors”.

      Response to some rebuttal statements: 

      (1) "Sequence comparisons based on the KL divergence require, at the studied time point, an observed distribution of amino acid frequencies among sites and an estimated distribution of amino acid frequencies among sites. In the study datasets, this is only the case for the HIV-1 MA dataset, which belongs to a previous study from one of us and collaborators where we obtained at least 20 independent sequences at each sampling point (Arenas, et al. 2016)" 

      The available Influenza and SARS-CoV-2 data gather isolates annotated with exact collection dates, providing rich datasets for such analysis. 

      The available influenza and SARS-CoV-2 sequences are typically derived from bulk sequencing and are therefore consensus sequences. As a result, they cannot be used to calculate KL divergence. Additionally, many of the indicated sequences from databases have not been shown to be consecutive in evolutionary terms (one dataset is not a direct ancestor of another dataset existing at a later time point), which precludes their use in this study. The most suitable studies often involve in vitro virus evolution, but they usually show low genetic variability between samples collected at different time points.

      (2) "Regarding extending the analysis to other time points (other variants of concern), we kindly disagree because Omicron is the variant of concern with the highest genetic distance to the Wuhan variant, and a high genetic distance is  required to properly evaluate the prediction method." 

      There have been many more variants of concern subsequent to Omicron which circulated in 2021. 

      A key aspect is the accumulation of diversity in the study proteins across different time points. The SARS-CoV-2 proteins Mpro and PLpro accumulated only 10 and 22 amino acid changes from the original Wuhan variant to Omicron, respectively.

      Analyzing intermediate variants of concern (e.g., Gamma or Delta) or those closely related to Omicron would reduce the number of accumulated mutations even further.   

      We want to thank the reviewer again for taking the time to revise our work and for the insightful and helpful comments.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      Ferreiro et al. present a method to simulate protein sequence evolution under a birth-death model where sequence evolution is constrained by structural constraints on protein stability. The authors then use this model to explore the predictability of sequence evolution in several viral structural proteins. In principle, this work is of great interest to molecular evolution and phylodynamics, which have struggled to couple non-neutral models of sequence evolution to phylodynamic models like birth-death. Unfortunately, though, the model shows little improvement over neutral models in predicting protein evolution, and this ultimately appears to be due to fundamental conceptual problems with how fitness is modeled and linked to the phylodynamic birth-death model. 

      We thank the reviewer for the positive comments about our work.

      Regarding predictive power, the study showed good accuracy in predicting the real folding stability of forecasted protein variants under a selection model, but not under a neutral model. Predicting the exact sequences was more challenging. In this revised version, where we added additional real data, we found that the accuracy of this prediction can vary among proteins (e.g., the SCS model was more accurate than the neutral model in predicting sequences of the influenza NS1 protein at different time points). Still, we consider that further efforts are required in the field of substitution models of molecular evolution. For example, substitutions between amino acids with similar physicochemical properties can yield predictions with appropriate folding stability but a different specific sequence. The development of accurate substitution models of molecular evolution is an active area of research with ongoing progress. In addition, forecasting the folding stability of future real proteins is fundamental for properly forecasting protein evolution, given the essential role of folding stability in protein function and its variety of applications. Regarding the conceptual concerns related to fitness modeling, we clarify them in detail in our responses to the specific comments below.

      Major concerns:

      (1) Fitness model: All lineages have the same growth rate r = b-d because the authors assume b+d=1. But under a birth-death model, the growth r is equivalent to fitness, so this is essentially assuming all lineages have the same absolute fitness since increases in reproductive fitness (b) will simply trade off with decreases in survival (d). Thus, even if the SCS model constrains sequence evolution, the birth-death model does not really allow for non-neutral evolution such that mutations can feed back and alter the structure of the phylogeny. 

      We thank the reviewer for this comment that aims to improve the realism of our model. In the model presented (but see later another model, derived from the proposal of the reviewer, that we have now implemented into the framework and applied it to the study data), the fitness predicted from a protein variant is used to obtain the corresponding birth rate of that variant. In this way, protein variants with high fitness have high birth rates leading to overall more birth events, while protein variants with low fitness have low birth rates resulting in overall more extinction events, which has biological meaning for the study system. The statement “All lineages have the same growth rate r = b-d” in our model is incorrect because, in our model, b and d can vary among lineages according to the fitness. For example, a lineage might have b=0.9, d=0.1, r=0.8, while another lineage could have b=0.6, d=0.4, r=0.2. Indeed, the statement “this is essentially assuming all lineages have the same absolute fitness” is incorrect. Clearly, assuming that all lineages have the same fitness would not make sense, in that situation the folding stability of the forecasted protein variants would be similar under any model, which is not the case as shown in the results. In our model, the fitness affects the reproductive success, where protein variants with a high fitness have higher birth rates leading to more birth events, while those with lower fitness have higher death rates leading to more extinction events. This parameterization is meaningful for protein evolution because the fitness of a protein variant can affect its survival (birth or extinction) without necessarily affecting its rate of evolution. While faster growth rate can sometimes be associated with higher fitness, a variant with high fitness does not necessarily accumulate substitutions at a faster rate. 
Regarding the phylogenetic structure, the model presented considers variable birth and death events across different lineages according to the fitness of the corresponding protein variants, and this affects the derived phylogeny (i.e., protein variants selected against can go extinct while others with high fitness can produce descendants). We are not sure about the meaning of the term “mutations can feed back” in the context of our system. Note that we use Markov models of evolution, which are well established in the field (despite their limitations), and substitutions are fixed mutations, which could still be reverted later if selected by the substitution model (Yang 2006). Altogether, we find that the presented birth-death model is technically correct and appropriate for modeling our biological system. Its integration with structurally constrained substitution (SCS) models of protein evolution as Markov models follows general approaches of molecular evolution in population genetics (Yang 2006; Carvajal-Rodriguez 2010; Arenas 2012; Hoban, et al. 2012). We have now provided a more detailed description of the models in the manuscript.

      Apart from these clarifications about the birth-death model used, we could understand the point of the reviewer and following the suggestion we have now incorporated an additional birth-death model that accounts for variable global birth-death rate among lineages. Specifically, we followed the model proposed by Neher et al (2014), where the death rate is considered as 1 and the birth rate is modeled as 1 + fitness. In this model, the global birth-death rate can vary among lineages. We implemented this model into the computer framework and applied it to the data used for the evaluation of the models. The results indicated that, in general, this model yields similar predictive accuracy compared to the previous birth-death model. Thus, accounting for variability in the global birth-death rate does not appear to play a major role in the studied systems of protein evolution. We have now presented this additional birth-death model and its results in the manuscript.
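The two parameterizations discussed in this response can be contrasted in a minimal Python sketch. The function names are hypothetical, and the sketch assumes that fitness is already scaled to [0, 1] and that the birth rate equals the fitness in the original model; it illustrates the stated rate definitions only and is not the implementation in the authors' framework.

```python
def constant_sum_rates(fitness):
    """Original parameterization: b from fitness, d = 1 - b, so b + d = 1.

    Assumption of this sketch: fitness lies in [0, 1] and b equals fitness.
    """
    if not 0.0 <= fitness <= 1.0:
        raise ValueError("fitness is assumed to lie in [0, 1]")
    b = fitness
    d = 1.0 - b
    return b, d


def variable_sum_rates(fitness):
    """Additional parameterization after Neher et al. (2014): d = 1, b = 1 + fitness."""
    return 1.0 + fitness, 1.0


# The two example lineages quoted in the response (b = 0.9 and b = 0.6).
for f in (0.9, 0.6):
    b, d = constant_sum_rates(f)
    print(f"b + d fixed: fitness={f} b={b:.1f} d={d:.1f} b-d={b - d:.1f} b+d={b + d:.1f}")
    b, d = variable_sum_rates(f)
    print(f"d fixed:     fitness={f} b={b:.1f} d={d:.1f} b-d={b - d:.1f} b+d={b + d:.1f}")
```

Under the first parameterization the sum b + d is the same for every lineage while b - d differs; under the second, both the sum and the difference change with fitness.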

      (2) Predictive performance: Similar performance in predicting amino acid frequencies is observed under both the SCS model and the neutral model. I suspect that this rather disappointing result owes to the fact that the absolute fitness of different viral variants could not actually change during the simulations (see comment #1). 

      As indicated in our previous answer, our study shows a good accuracy in predicting the real folding stability of forecasted protein variants under a selection model, but not under a neutral model. Next, predicting the exact sequences was more challenging, which was not surprising considering previous studies. In particular, inferring specific sequences is considerably challenging even for ancestral molecular reconstruction (Arenas, et al. 2017; Arenas and Bastolla 2020). Indeed, observed sequence diversity is much greater than observed structural diversity (Illergard, et al. 2009; Pascual-Garcia, et al. 2010), and substitutions between amino acids with similar physicochemical properties can yield modeled protein variants with more accurate folding stability, even when the exact amino acid sequences differ. As indicated, further work is demanded in the field of substitution models of molecular evolution. Next, in this revised version, where we included analyses of additional real datasets, we found that the accuracy of sequence prediction can vary among datasets. Notably, the analysis of an influenza NS1 protein dataset, with higher diversity than the other datasets studied, showed that the SCS model was more accurate than the neutral model in predicting sequences across different time points. Datasets with relatively high sequence diversity can contain more evolutionary information, which can improve prediction quality. In any case, as previously indicated, we believe that efforts are required in the field of substitution models of molecular evolution. Apart from that, forecasting the folding stability of future real proteins is an important advance in forecasting protein evolution, given the essential role of folding stability in protein function (Scheiblhofer, et al. 2017; Bloom and Neher 2023) and its variety of applications.

      Next, also as indicated in our previous response, the birth-death model used in this study accounts for variation in fitness among lineages producing variable reproductive success. The additional birth-death model that we have now incorporated, which considers variation of the global birth-death rate among lineages, produced similar prediction accuracy, suggesting a limited role in protein evolution modeling. Molecular evolution parameters, particularly the substitution model, appear to be more critical in this regard. We have now included these aspects in the manuscript.

      (3) Model assessment: It would be interesting to know how much the predictions were informed by the structurally constrained sequence evolution model versus the birth-death model. To explore this, the authors could consider three different models: 1) neutral, 2) SCS, and 3) SCS + BD. Simulations under the SCS model could be performed by simulating molecular evolution along just one hypothetical lineage. Seeing if the SCS + BD model improves over the SCS model alone would be another way of testing whether mutations could actually impact the evolutionary dynamics of lineages in the phylogeny. 

      In the present study, we compared the neutral model + birth-death (BD) with the SCS model + BD. Markov substitution models Q are applied over an evolutionary time (i.e., branch length, t), which allows the probability of substitution events during that period to be determined [P(t) = exp(Qt)]. This approach is traditionally used in phylogenetics to model the incorporation of substitution events over time. Therefore, comparing the neutral and SCS models in terms of evolutionary inference requires an evolutionary time, which in this case is provided by the birth-death process. Thus, cases 1) and 2) cannot be compared without an underlying evolutionary history. Next, comparisons in terms of likelihood, and other aspects, between models that ignore the protein structure and the implemented SCS models are already available in previous studies based on coalescent simulations or given phylogenetic trees (Arenas, et al. 2013; Arenas, et al. 2015). There, SCS models outperformed models that ignore evolutionary constraints from the protein structure, and those findings are consistent with the results obtained in the present study, where we explored the application of these models to forecasting protein evolution. We would like to emphasize that forecasting the folding stability of future real proteins is a significant finding, as folding stability is fundamental to protein function and has a variety of applications. We have now indicated these aspects in the manuscript.
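The relation P(t) = exp(Qt) mentioned in this response can be illustrated with a small pure-Python sketch. The two-state rate matrix is a toy example chosen for illustration (real amino acid models use 20 states), and the function names and the Taylor-series approach with scaling and squaring are assumptions of this sketch rather than the method used in the authors' software.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, squarings=10, terms=20):
    """Approximate P(t) = exp(Q t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds A^k / k! after this step
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        # undo the scaling: exp(A)^(2^s) = exp(Q t)
        result = mat_mul(result, result)
    return result


# Toy two-state rate matrix (each row sums to zero); values are illustrative only.
Q = [[-1.0, 1.0],
     [1.0, -1.0]]

P_short = transition_probabilities(Q, 0.1)   # short branch: mostly no change
P_long = transition_probabilities(Q, 10.0)   # long branch: near equilibrium
print(P_short)
print(P_long)
```

Without a branch length t there is no P(t) to evaluate, which is the point made above about needing an evolutionary history before the substitution models can be compared.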

      (4) Background fitness effects: The model ignores background genetic variation in fitness. I think this is particularly important as the fitness effects of mutations in any one protein may be overshadowed by the fitness effects of mutations elsewhere in the genome. The model also ignores background changes in fitness due to the environment, but I acknowledge that might be beyond the scope of the current work. 

      This comment made us realize that more information about the features of the implemented SCS models should be included in the manuscript. In particular, the implemented SCS models consider a negative design based on the observed residue contacts in nearly all proteins available in the Protein Data Bank (Arenas, et al. 2013; Arenas, et al. 2015). These data are distributed with the framework and can be updated to incorporate new structures (further details are provided in the distributed framework documentation and practical examples). Therefore, the prediction of folding stability is a combination of positive design (direct analysis of the target protein) and negative design (consideration of background proteins from a database to improve the predictions), thus incorporating background molecular diversity. We have now indicated this important aspect in the manuscript. Regarding the fitness caused by the environment, we agree with the reviewer. This is a challenge for any method aiming to forecast evolution, as future environmental shifts are inherently unpredictable and may affect the accuracy of the predictions. Although one might attempt to incorporate such effects into the model, doing so risks overparameterization, especially when the additional factors are uncertain or speculative. We have now mentioned this aspect in the manuscript.

      (5) In contrast to the model explored here, recent work on multi-type birth-death processes has considered models where lineages have type-specific birth and/or death rates and therefore also type-specific growth rates and fitness (Stadler and Bonhoeffer, 2013; Kunhert et al., 2017; Barido-Sottani, 2023). Rasmussen & Stadler (eLife, 2019) even consider a multi-type birth-death model where the fitness effects of multiple mutations in a protein or viral genome collectively determine the overall fitness of a lineage. The key difference with this work presented here is that these models allow lineages to have different growth rates and fitness, so these models truly allow for non-neutral evolutionary dynamics. It would appear the authors might need to adopt a similar approach to successfully predict protein evolution. 

      We agree with the reviewer that robust birth-death models have been developed applying statistics and, in many cases, the primary aim of those studies is the development and refinement of the model itself. Regarding the study by Rasmussen and Stadler 2019, it incorporates an external evaluation of mutation events where the fitness used is specific to the proteins investigated in that study, which may pose challenges for users interested in analyzing other proteins. In contrast, our study takes a different approach. We implement a fitness function that can be predicted and evaluated for any type of structural protein (Goldstein 2013), making it broadly applicable. In fact, in this revised version we added the analysis of additional data from another protein (influenza NS1 protein) with predictions at different time points. In addition, we provide a freely available and well-documented computational framework to facilitate its use. The primary aim of our study is not the development of novel or complex birth-death models. Rather, we aim to explore the integration of a standard birth-death model with SCS models for the purpose of predicting protein evolution. In the context of protein evolution, substitution models are a critical factor (Liberles, et al. 2012; Wilke 2012; Bordner and Mittelmann 2013; Echave, et al. 2016; Arenas, et al. 2017; Echave and Wilke 2017), and the presented combination with a birth-death model constitutes a first approximation upon which future studies can build to better understand this evolutionary system. We have now indicated these considerations in the manuscript.

      Reviewer #2 (Public review): 

      Summary: 

      In this study, "Forecasting protein evolution by integrating birth-death population models with structurally constrained substitution models", David Ferreiro and coauthors present a forward-in-time evolutionary simulation framework that integrates a birth-death population model with a fitness function based on protein folding stability. By incorporating structurally constrained substitution models and estimating fitness from ΔG values using homology-modeled structures, the authors aim to capture biophysically realistic evolutionary dynamics. The approach is implemented in a new version of their open-source software, ProteinEvolver2, and is applied to four viral proteins from HIV-1 and SARS-CoV-2. 

      Overall, the study presents a compelling rationale for using folding stability as a constraint in evolutionary simulations and offers a novel framework and software to explore such dynamics. While the results are promising, particularly for predicting biophysical properties, the current analysis provides only partial evidence for true evolutionary forecasting, especially at the sequence level. The work offers a meaningful conceptual advance and a useful simulation tool, and sets the stage for more extensive validation in future studies.

      We thank the reviewer for the positive comments on our study. Regarding the predictive power, the results showed good accuracy in predicting the folding stability of the forecasted protein variants. In this revised version, where we included analyses of additional real datasets, we found that the accuracy of sequence prediction can vary among datasets. Notably, the analysis of an influenza NS1 protein dataset, with higher diversity than the other datasets studied, showed that the SCS model was more accurate than the neutral model in predicting sequences across different time points. Datasets with relatively high sequence diversity can contain more evolutionary information, which can improve prediction quality. Still, we believe that further efforts are required to improve the accuracy of substitution models of molecular evolution. Altogether, accurately forecasting the folding stability of future real proteins is fundamental for predicting their function and enabling a variety of applications. Also, we implemented the models in a freely available computer framework, with detailed documentation and a variety of practical examples.

      Strengths: 

      The results demonstrate that fitness constraints based on protein stability can prevent the emergence of unrealistic, destabilized variants - a limitation of traditional, neutral substitution models. In particular, the predicted folding stabilities of simulated protein variants closely match those observed in real variants, suggesting that the model captures relevant biophysical constraints. 

      We agree with the reviewer and appreciate the consideration that forecasting the folding stability of future real proteins is a relevant finding. For instance, folding stability is fundamental for protein function and affects several other molecular properties.

      Weaknesses: 

      The predictive scope of the method remains limited. While the model effectively preserves folding stability, its ability to forecast specific sequence content is not well supported. 

      Our study showed a good accuracy in predicting the real folding stability of forecasted protein variants under a selection model, but not under a neutral model. Predicting the exact sequences was more challenging, which was not surprising considering previous studies. In particular, inferring specific sequences is considerably challenging even for ancestral molecular reconstruction (Arenas, et al. 2017; Arenas and Bastolla 2020). Indeed, observed sequence diversity is much greater than observed structural diversity (Illergard, et al. 2009; Pascual-Garcia, et al. 2010), and substitutions between amino acids with similar physicochemical properties can yield modeled protein variants with more accurate folding stability, even when the exact amino acid sequences differ. As indicated, further work is demanded in the field of substitution models of molecular evolution. Next, in this revised version, where we included analyses of additional real datasets, we found that the accuracy of sequence prediction can vary among datasets. Notably, the analysis of an influenza NS1 protein dataset, with higher diversity than the other datasets studied, showed that the SCS model was more accurate than the neutral model in predicting sequences across different time points. Datasets with relatively high sequence diversity can contain more evolutionary information, which can improve prediction quality. In any case, as previously indicated, we believe that efforts are required in the field of substitution models of molecular evolution. Apart from that, forecasting the folding stability of future real proteins is an important advance in forecasting protein evolution, given the essential role of folding stability in protein function (Scheiblhofer, et al. 2017; Bloom and Neher 2023) and its variety of applications. We have now expanded these aspects in the manuscript.

      Only one dataset (HIV-1 MA) is evaluated for sequence-level divergence using KL divergence; this analysis is absent for the other proteins. The authors use a consensus Omicron sequence as a representative endpoint for SARS-CoV-2, which overlooks the rich longitudinal sequence data available from GISAID. The use of just one consensus from a single time point is not fully justified, given the extensive temporal and geographical sampling available. Extending the analysis to include multiple timepoints, particularly for SARS-CoV-2, would strengthen the predictive claims. Similarly, applying the model to other well-sampled viral proteins, such as those from influenza or RSV, would broaden its relevance and test its generalizability. 

      The evaluation of forecasting evolution using real datasets is complex due to several conceptual and practical aspects. In contrast to traditional phylogenetic reconstruction of past evolutionary events and ancestral sequences, forecasting evolution often begins with a variant that is evolved forward in time and requires a rough fitness landscape to select among possible future variants (Lässig, et al. 2017). Another concern for validating the method is the need to know the initial variant that gives rise to the corresponding future (forecasted) variants, and it is not always known. Thus, we investigated systems where the initial variant, or a close approximation, is known, such as scenarios of in vitro monitored evolution. In the case of SARS-CoV-2, the Wuhan variant is commonly used as the starting variant of the pandemic. Next, since forecasting evolution is highly dependent on the used model of evolution, unexpected external factors can be dramatic for the predictions. For this reason, systems with minimal external influences provide a more controlled context for evaluating forecasting evolution. For instance, scenarios of in vitro monitored virus evolution avoid some external factors such as host immune responses. Another important aspect is the availability of data at two (i.e., present and future) or more time points along the evolutionary trajectory, with sufficient genetic diversity between them to identify clear evolutionary signatures. Additionally, using consensus sequences can help mitigate effects from unfixed mutations, which should not be modeled by a substitution model of evolution. Altogether, not all datasets are appropriate to properly evaluate or apply forecasting evolution. These aspects are indicated in the manuscript. Sequence comparisons based on the KL divergence require, at the studied time point, an observed distribution of amino acid frequencies among sites and an estimated distribution of amino acid frequencies among sites. 
In the study datasets, this is only the case for the HIV-1 MA dataset, which belongs to a previous study from one of us and collaborators where we obtained at least 20 independent sequences at each sampling point (Arenas, et al. 2016). This aspect is now more clearly indicated in the manuscript. Regarding the Omicron datasets, we used 384 curated sequences of the Omicron variant of concern to construct the study data and we believe that it is a representative sample. The sequence used for the initial time point was the Wuhan variant (Wu, et al. 2020), which is commonly assumed to be the origin of the pandemic in SARS-CoV-2 studies. As previously indicated, the use of consensus sequences is convenient to avoid variants with unfixed mutations. Regarding extending the analysis to other time points (other variants of concern), we kindly disagree because Omicron is the variant of concern with the highest genetic distance to the Wuhan variant, and a high genetic distance is required to properly evaluate the prediction method. Actually, we noted that earlier variants of concern show a small number of fixed mutations in the study proteins, despite the availability of large numbers of sequences in databases such as GISAID. Additionally, we investigated the evolutionary trajectories of HIV-1 protease (PR) in 12 intra-host viral populations with predictions for up to four different time points. Apart from those aspects, following the proposal of the reviewer, we have now incorporated the analysis of an additional dataset of influenza NS1 protein (Bao, et al. 2008), with predictions for two different time points, to further assess the generalizability of the method. We have now included details of this influenza NS1 protein dataset and the predictions derived from it in the manuscript.
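The KL-divergence comparison described above needs two distributions of amino acid frequencies per site, one observed and one predicted. A minimal Python sketch of that calculation for a single alignment column follows. The function names, the pseudocount used to avoid division by zero, the toy columns, and the direction of the divergence (observed relative to predicted) are assumptions of this sketch and are not taken from the manuscript.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_frequencies(column, pseudocount=1e-6):
    """Amino acid frequencies at one alignment site.

    The pseudocount is an assumption of this sketch; it keeps every
    frequency strictly positive so the logarithm below is defined.
    """
    counts = {aa: pseudocount for aa in AMINO_ACIDS}
    for residue in column:
        if residue in counts:  # gaps and ambiguous symbols are skipped
            counts[residue] += 1.0
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def kl_divergence(p, q):
    """KL divergence D(p || q) between two frequency dictionaries."""
    return sum(p[aa] * math.log(p[aa] / q[aa]) for aa in AMINO_ACIDS)


# Toy columns of 10 sequences each (hypothetical data, for illustration only).
observed = site_frequencies("AAAAAAAAVV")
predicted = site_frequencies("AAAAAAAVVV")

print(kl_divergence(observed, predicted))  # small positive value
print(kl_divergence(observed, observed))   # identical distributions give 0.0
```

With a single consensus sequence per time point, each column holds one residue, so no meaningful observed distribution exists. This is why the analysis is restricted to datasets with multiple independent sequences per sampling point.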

      It would also be informative to include a retrospective analysis of the evolution of protein stability along known historical trajectories. This would allow the authors to assess whether folding stability is indeed preserved in real-world evolution, as assumed in their model.

      Our present study does not aim to investigate the evolution of the folding stability over time, although it provides this information indirectly at the studied time points. Instead, the present study shows that the folding stability of the forecasted protein variants is similar to the folding stability of the corresponding real protein variants for diverse viral proteins, which provides an important evaluation of the prediction method. Next, the folding stability can indeed vary over time in both real and modeled evolutionary scenarios, and our present study is not in conflict with this. In that regard, which is not the aim of our present study, some previous phylogenetic-based studies have reported temporal fluctuations in folding stability for diverse protein data (Arenas, et al. 2017; Olabode, et al. 2017; Arenas and Bastolla 2020; Ferreiro, et al. 2022).

      Finally, a discussion on the impact of structural templates - and whether the fixed template remains valid across divergent sequences - would be valuable. Addressing the possibility of structural remodeling or template switching during evolution would improve confidence in the model's applicability to more divergent evolutionary scenarios.

      This is an important point. For the datasets that required homology modeling (in several cases it was not necessary because the sequence was present in a protein structure of the PDB), the structural templates were selected using SWISS-MODEL, and we applied the best-fitting template. We have now included in a supplementary table details about the fitting of the structural templates. Indeed, our proposal assumes that the protein structure is maintained over the studied evolutionary time, which can be generally reasonable for short timescales where the structure is conserved (Illergard, et al. 2009; Pascual-Garcia, et al. 2010). Over longer evolutionary timescales, structural changes may occur and, in such cases, modeling the evolution of the protein structure would be necessary. To our knowledge, modeling the evolution of the protein structure remains a challenging task that requires substantial methodological developments. Recent advances in artificial intelligence, particularly in protein structure prediction from sequence, may offer promising tools for addressing this challenge. However, we believe that evaluating such approaches in the context of structural evolution would be difficult, especially given the limited availability of real data with known evolutionary trajectories involving structural change. In any case, this is probably an important direction for future research. We have now included this discussion in the manuscript.

      Reviewer #1 (Recommendations for the authors): 

      (1) Abstract: "expectedly, the errors grew up in the prediction of the corresponding sequences" <- Not entirely clear what is meant by "errors grew up" or what the errors grew with.

      This sentence refers to the accuracy of sequence prediction in comparison to that of folding stability prediction. We have now clarified this aspect in the manuscript.

      (2) Lines 162-165: "Alternatively, if the fitness is determined based on the similarity in folding stability between the modeled variant and a real variant, the birth rate is assumed to be 1 minus the root mean square deviation (RMSD) in folding stability." <- What is the biological motivation for using the RMSD? It seems like a more stable variant would always have higher fitness, at least according to Equation 1.

      RMSD is commonly used in molecular biology to compare proteins in terms of structural distance, folding stability, kinetics, and other properties. It offers advantages such as minimizing the influence of small deviations while amplifying larger differences, thereby enhancing the detection of remarkable molecular changes. Additionally, RMSD would facilitate the incorporation of other biophysical parameters, such as structural divergences from a wild-type variant or entropy, which could be informative for fitness in future versions of the method. We have now included this consideration in the manuscript.
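The rule quoted in the reviewer's comment (birth rate equal to 1 minus the RMSD in folding stability) can be sketched in Python as follows. How the RMSD is computed and scaled in the authors' software is not stated here, so the paired lists of stability values, the function names, and the clamping of the result to [0, 1] are all assumptions of this sketch.

```python
import math


def stability_rmsd(dg_modeled, dg_real):
    """RMSD between paired folding-stability values (same units, same order)."""
    if len(dg_modeled) != len(dg_real) or not dg_modeled:
        raise ValueError("expected two non-empty lists of equal length")
    squared = [(m - r) ** 2 for m, r in zip(dg_modeled, dg_real)]
    return math.sqrt(sum(squared) / len(squared))


def birth_rate_from_stability(dg_modeled, dg_real):
    """b = 1 - RMSD; clamping to [0, 1] is an assumption of this sketch."""
    return min(1.0, max(0.0, 1.0 - stability_rmsd(dg_modeled, dg_real)))


# Hypothetical stability values, for illustration only.
close_match = birth_rate_from_stability([-5.0], [-5.2])  # about 0.8
less_stable = birth_rate_from_stability([-4.8], [-5.0])
more_stable = birth_rate_from_stability([-5.2], [-5.0])
print(close_match, less_stable, more_stable)
```

Because the RMSD is symmetric, a variant that is more stable than the reference is penalized as much as one that is equally less stable. This differs from a fitness function in which higher stability always gives higher fitness.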

      (3) Lines 165-166: "In both cases, the death rate (d) is considered as 1-b to allow a constant global (birth-death) rate" <- This would give a constant R = b / (1-b) over the entire phylogenetic tree. For applications to pathogens like viruses with epidemic dynamics, this is extremely implausible. Is there any need to make such a restrictive assumption? 

      Regarding technical considerations of the model, we refer to our answer to the first public review comment. Next, a constant global rate of evolution has been observed in numerous genes and proteins of diverse organisms, including viruses (Gojobori, et al. 1990; Leitner and Albert 1999; Shankarappa, et al. 1999; Liu, et al. 2004; Lu, et al. 2018; Zhou, et al. 2019). However, following the comment of the reviewer, and as we indicated in our answer to the first public review comment, we have now implemented and evaluated an additional birth-death model that allows for variation in the global birth-death rate among lineages. We have implemented this additional model in the framework and described it along with its results in the manuscript.

      (4) Lines 187-188: "As a consequence, since b+d=1 at each node, tn is consistent across all nodes, according to (Harmon, 2019)." <- This would also imply that all lineages have a growth rate r = b - d, which under a birth-death model is equivalent to saying all lineages have the same fitness! 

      We clarified this aspect in our answer to the first public review comment. In particular, in the model presented, protein variants with higher fitness have higher birth rates, leading to more birth events, while protein variants with lower fitness have lower birth rates, leading to more extinction events, which is biologically meaningful for the study system. In our model, b and d can vary among lineages according to the corresponding fitness (e.g., one lineage may have b=0.9, d=0.1, r=0.8, while another may have b=0.6, d=0.4, r=0.2). Since reproductive success varies among lineages in our model, the statement “this is essentially assuming all lineages have the same absolute fitness” is incorrect, although it could be interpreted that way in certain models. Fitness affects reproductive success, but fitness and growth rate of evolution are different biological processes (although a faster growth rate can sometimes be associated with higher fitness, a variant with high fitness does not necessarily accumulate substitutions at a higher rate). An example in molecular adaptation studies is the traditional nonsynonymous to synonymous substitution rate ratio (dN/dS), where dN/dS (which informs about selection derived from fitness) can be constant at different rates of evolution (dN and dS). In any case, we thank the reviewer for raising this point, which led us to incorporate an additional birth-death model and inspired some ideas. Thus, following the comment of the reviewer and as indicated in the answer to the first public review comment, we have now implemented and evaluated an additional birth-death model that allows for variation in the global birth-death rate among lineages. The results indicated that this model yields similar predictive accuracy compared to the previous birth-death model. We have now included these aspects, along with the results from the additional model, in the manuscript.

      (5) Line 321-322: "For the case of neutral evolution, all protein variants equally fit and are allowed, leading to only birth events," <- Why would there only be birth events? Lineages can die regardless of their fitness. 

      In the neutral evolution model, all protein variants have the same fitness, resulting in a flat fitness landscape. Since variants are observed, we allowed birth events. However, we assumed the absence of death events because no information independent of fitness is available to support their inclusion and quantification, thereby avoiding the imposition of arbitrary death events based on an arbitrary death rate. We have now provided a justification of this assumption in the manuscript.

      Reviewer #2 (Recommendations for the authors): 

      (1) Clarify the purpose of the alternative fitness mode ("ΔG similarity to a target variant"): 

      The manuscript briefly introduces an alternative fitness function based on the similarity of a simulated protein's folding stability to that of a real protein variant, but does not provide a clear motivation, usage scenario, or results derived from it. 

      The presented model provides two approaches for deriving fitness from predicted folding stability. The simpler approach assumes that a more stable protein variant has higher fitness than a less stable one. The alternative approach assigns high fitness to protein variants whose stability closely matches observed stability, acknowledging that the real observed stability is derived from the real selection process, and this approach considers negative design by contrasting the prediction with real information. For the analyses of real data in this study, we used the second approach, guided by these considerations. We have now clarified this aspect in the manuscript.

      (2) Report structural template quality and modeling confidence: 

      Since folding stability (ΔG) estimates rely on structural models derived from homology templates, the accuracy of these predictions will be sensitive to the choice and quality of the template structure. I recommend that the authors report, for each protein modeled, the template's sequence identity, coverage, and modeling quality scores. This will help readers assess the confidence in the ΔG estimates and interpret how template quality might impact simulation outcomes. 

      We agree with the reviewer and we have now included additional information in a supplementary table regarding sequence identity, modeling quality and coverage of the structural templates for the proteins that required homology modeling. The selection of templates was performed using the well-established framework SWISS-MODEL and the best-fitting template was chosen. Next, a large number of protein structures are available in the PDB for the study proteins, which favors the accuracy of the homology modeling. For some datasets, homology modeling was not required, as the modeled sequence was already present in an available protein structure. We have now included this information in the manuscript and in a supplementary table.

      (3) Clarify whether structural remodeling occurs during simulation: 

      It appears that folding stability (ΔG) for all simulated protein variants is computed by mapping them onto a single initial homology model, without remodeling the structure as sequences evolve. If correct, this should be clearly stated, as it assumes that the structural fold remains valid across all simulated variants. A discussion on the potential impact of structural drift would be welcome.

      We agree with the reviewer. As indicated in our answer to a previous comment, our method assumes that the protein structure is maintained over the studied evolutionary time, which is generally acceptable for short timescales where the structure is conserved (Illergard, et al. 2009; Pascual-Garcia, et al. 2010). At longer timescales the protein structure could change, requiring the modeling of structural evolution over the evolutionary time. To our knowledge, modeling the evolution of the protein structure remains a challenging task that requires substantial methodological developments. Recent advances in artificial intelligence, particularly in protein structure prediction from sequence, can be promising tools for addressing this challenge. However, we believe that evaluating such approaches in the context of structural evolution would be difficult, especially given the limited availability of real datasets with known evolutionary trajectories involving structural change. In any case, this is probably an important direction for future research. We have now included this discussion in the manuscript.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this study, the authors develop a complete integral drive system in Anopheles gambiae malaria mosquitoes. This type of gene drive is interesting, with special advantages and disadvantages compared to more common designs. Here, the authors develop the Cas9 element and combine it with a previously developed antimalaria effector element. The new element performs very well in terms of drive efficiency, but it has unintended fitness costs, and a higher than desirable rate of functional resistance allele formation. Nevertheless, this study represents a very good step forward toward developing effective gene drives and is thus of high impact.

      The format of the manuscript is a bit suboptimal for review. Please add line numbers next time for easy reference. It would also help to have spaces between paragraphs and to have figures (with legends) added to the text where they first appear.

      It might be useful to add subsections to the results, just like in the methods section. It could even be expanded a bit with some specific parts from the discussion, though this is optional.

      Abstract: The text says: "As a minimal genetic modification, nanosd does not induce widespread transcriptomic perturbations." However, it does seem to change things based on Figure 3c.

      Page 2: "drive technologies for public health and pest control applications" needs a period afterward.

      Page 2: "The fitness costs, homing efficiency, and resistance rate of the gene drive is" should be "The fitness costs, homing efficiency, and resistance rate of the gene drive are".

      Page 2: "When they target important mosquito genes, gene drives are designed to ensure that the nuclease activity window (germline) does not overlap with that of the target gene (somatic)." is not quite correct. This is, of course, sensible for suppression drives, but it's not a necessary requirement for modification drives with rescue elements in many situations.

      Page 2: "recessive somatic fitness cost phenotypes" is unclear. I think that you are trying to avoid the recessive fitness cost of null alleles becoming a dominant fitness cost from a gene drive allele (in drive-wild-type heterozygotes).

      Page 2: "This optimization approach has had only limited success, and suboptimal performance is commonly attributed to not capturing all the regulatory elements specific to the germline gene's expression9,12". I don't think this is correct. There are several examples where a new promoter helped a lot. The zpg promoter in Anopheles gambiae allowed success at the dsx site in suppression cage studies (Kyrou et al 2018), and nanos gave a big improvement to modification drives at the cardinal locus (Carballar-Lejarazú et al 2020). In flies, several promoters were tested, and one allowed success in cage experiments (Du et al 2024). In Aedes, the shu promoter allowed for high drive performance (Anderson et al 2023), though this last one hasn't been tested in more difficult situations. I think you could certainly argue in the general case that not all promoters will work the way their transcriptome says, but there are many examples where they seem to be pretty good.

      Page 2: "make it more likely that mutations that disrupt the drive components are selected against though loss of function of the host gene." I think that this needs a bit more explanation. You are referring to mutations in regulatory elements or frameshift mutations. This will make it more resistant to mutation. Also, these mutations would tend to have a minor effect except perhaps in the cargo gene of a modification drive. By using a cargo gene in an integral drive, you could still keep it somewhat safer, but whether this is 1.2x or 10x safer is unclear.

      Page 3: "they can incur severe unintended fitness costs". This is central to integral drives and this manuscript. It's worth elaborating on.

      Page 3: "Regulatory elements from germline genes that have worked sub-optimally in traditional gene drive designs for the reasons outlined above may work well in an IDG design20." This is setting up the integral drive with nanos, but nanos DOES work well in traditional Anopheles gambiae gene drive designs. It is possible that you might end up with less somatic expression than Hammond et al 2020 (though the comparison is unclear due to batch effects in that study), but there is no direct comparison in this manuscript to that.

      Page 3: "This suggests an impact of maternal deposition on drive efficiency only in female drive carriers." This is quite strange. Usually, I would expect to see an equal reduction in efficiency between male and female progeny. Could this be due to limited sample size? Random idea: It's also possible that almost all maternal deposition was mosaic and wouldn't be enough to directly affect drive conversion. However, it could cause enough of a fitness cost TOGETHER with new drive expression in females that perhaps only tissues with randomly low expression rates properly developed and led to progeny, reducing drive inheritance? Another possibility: Could the drive/resistance males have impaired fertility, so these ones are underrepresented in the batch cross? If nanos is needed in males and a single drive copy is not quite enough for good fertility or mating competitiveness, they may be underrepresented in your crosses. They might have worse fertility than drive homozygous males, which at least have two partially working copies of nanos rather than just one (in many cells, at least). Maybe check the testis for abnormal phenotypes?

      Overall, it would be favorable if the drive allele was somewhat more fit than a nonfunctional resistance allele. This could already be achieved in this drive, but it doesn't get much mention.

      Page 3: There should be a comma after, "Interestingly, while many of the observed mutations were predicted to abolish nanos expression" and "This could indicate that in these experiments".

      Page 3 last sentence: Please improve the clarity.

      Removing the EGFP is supposed to restore the fitness, and this was helpful in some previous integral drive constructs. This could get a bit more mention (it is possible that I missed this somewhere in the manuscript).

      Page 4: The MM-CP line and its association with the integral drive strategy could get a little more introduction. Maybe even a supplemental figure showing the general idea.

      Page 5: "cassette is predicted to disrupt the CP function entirely (Fig. 5d)" also lacks a period.

      Page 5: "The subsequent stabilization of the nanosd frequency and the lack of rapid loss suggests that any associated fitness cost is primarily recessive." This is not quite correct because by this point, drive/wild-type heterozygotes are rare, and this is where you'd find a potential dominant fitness cost. It should be correct in the end stages where it is a mix of drive and functional/nonfunctional resistance alleles (though the nonfunctional resistance alleles may cause greater fitness costs when together with a drive - see above).

      Page 6: "Maternal deposition of Cas9, or Cas9;gRNA, into the zygote can lead to cutting at stages when homing is not favoured, and has been commonly observed for canonical Anopheles nanos drives9,10,35." Reference 35 (which is more suitable for referencing an example of nanos in other Anopheles) found some resistance alleles by deep sequencing, but the timing that they formed was unclear (it's not certain if it was maternal deposition). This study may be a more suitable reference: Carballar-Lejarazú R, Tushar T, Pham TB, James AA. Cas9-mediated maternal-effect and derived resistance alleles in a gene-drive strain of the African malaria vector mosquito, Anopheles gambiae. Genetics, 2022.

      Page 8: "could further reduce the likelihood of resistance allele formation by increasing the frequency of HDR events." Multiple gRNAs would mostly help by reducing functional resistance allele formation, especially since drive conversion is already very high in Anopheles.

      Page 8, last paragraph: This conclusion is perhaps a little optimistic considering the functional resistance alleles, which should get a little more attention in the summary or elsewhere in the discussion section.

      Figure 1d: The vertical text saying "Non-WT" is confusing. The circles themselves show + and -. Also, "-" isn't necessarily a knockout allele, so I'm not sure if - is the best symbol for resistance.

      Figure 2e: The vertical scale is not the most intuitive. Consider rearranging it to "Transition from larvae to pupae" starting at zero and going to 1 when all the larvae become pupae.

      Figure 2e-f: For both of these, there are clear differences between males and females. Thus, when comparing drive homozygotes to wild-type, it would probably be better to have separate statistical comparisons for males and females.

      Figure 3: Can any of these transcription results in individual genes potentially explain the observed fitness cost?

      Figure 3b: The scale here also doesn't quite make sense. It's the fraction of underdeveloped ovaries, but the graph is also perhaps trying to show whether just 1-2 ovaries are present, or maybe how many ovaries are undeveloped, but then it would say "zero"? This should be clarified. Number of ovaries and how well-developed they are is separate (it can be put on the same graph, but needs to be more clear).

      Figure 4f: The vertical axis should say "ONNV."

      Figure 5c-d: These should be labeled as the most common resistance allele. Also, I'm not sure how relevant it is, but we also found an alternate start codon here: Hou S, Chen J, Feng R, Xu X, Liang N, Champer J. A homing rescue gene drive with multiplexed gRNAs reaches high frequency in cage populations but generates functional resistance. J Genet Genomics, 2024. Maybe this is a more common problem than one would expect?

      Figure 5cd,S4,S5: They have a bit of a weird plot. Why not make four line graphs for each? Also, some alleles use the  symbol. + is wild-type, which is well understood, but - as resistance is not always clear, and seeing them together may confuse readers. Additionally, the fact that you have the most common resistance allele in Figure 5cd might mean that you know more about the genotype? If so, it would be best to separate wild-type and resistance alleles in whatever the final figure looks like.

      Some supplemental raw data files would be useful if they were available, but the figures are thorough enough that this isn't essential.

      Review by:

      Jackson Champer, with major assistance from Ruobing Feng (essentially section B) and Jie Du

      Referee cross-commenting

      We don't have any cross-comments, other than supporting the idea of slightly more comparisons to the authors' previous construct.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      A key innovation of the nanosd gene drive is its integral gene drive (IGD) design, which inserts the drive cassette directly into the A. gambiae nanos gene, incorporating only the minimal components necessary for drive function. The drive achieves high transmission rates, without causing widespread disruption of gene expression or increasing susceptibility to malaria parasites, and imposes an acceptable fitness cost, primarily a reduction in female fecundity when homozygous. The strong performance of nanosd can be attributed to its design: Cas9 is expressed in the correct cells and with the correct timing to induce efficient homing, effectively hijacking the nanos gene's natural expression profile. However, despite the careful design aimed at preserving nanos function, the rescue was incomplete: homozygous female drive carriers exhibited a clear reduction in ovarian function.

      In caged population trials, both the drive and a co-introduced anti-malaria effector gene reached high frequencies, even in the presence of emerging resistance alleles. Because the drive is inserted into an essential gene, nonfunctional resistance alleles are selected against and tend to be purged over time. Nonetheless, functional resistance remains a concern. The use of a single, though precisely positioned gRNA targeting the native nanos gene ATG site increases the likelihood of generating functional resistance alleles. Over the long term, if the drive imposes fitness costs, it may be outcompeted by such functional resistance alleles, potentially undermining the goal of sustained population modification.

      Overall, this study represents a notable advance in Anopheles mosquito gene drive development and can be considered high impact.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Previous IGD efforts in Drosophila, mice and mosquitoes have demonstrated nearly super‐Mendelian inheritance but often at the expense of host fitness. For example, Nash et al. built an intronic‐gRNA Cas9 drive at the D. melanogaster rcd-1r locus that propagated efficiently through cage populations (Nash et al., 2022), and Gonzalez et al. reported that a Cas9 drive inserted at the germline zpg locus in Anopheles stephensi biased inheritance by ~99.8% (Gonzalez et al., 2025). However, these strong drives disrupted essential genes: in A. gambiae, inserting Cas9 into zpg produced efficient homing but rendered females largely sterile (Ellis et al., 2022). A similar germline Cas9 knock-in in Mus musculus enabled gene conversion in both sexes, albeit with only modest efficiency and potential fitness trade-offs (Weitzel et al., 2021). The current nanosd IGD is explicitly designed to overcome this limitation by selecting a more permissive gene target and using a minimal drive cassette, so as to preserve mosquito viability while maintaining robust drive efficiency, although fertility is still reduced in female drive homozygotes.

      This nanosd gene drive, like previous homing drives in Anopheles, is capable of achieving a high level of inheritance bias. Although it uses the endogenous nanos regulatory elements, which have less leaky somatic expression compared to using vasa (Gantz et al., 2015; Hammond et al., 2016; Hammond et al., 2017) or zpg promoters (Hammond et al., 2021; Kyrou et al., 2018), to drive Cas9 expression and thereby reduces somatic expression-induced female sterility, the incomplete rescue of nanos function still leads to reduced female fertility in drive homozygotes.

      • State what audience might be interested in and influenced by the reported findings.

      It's worth noting the broad audience that will find this work relevant. Gene drive developers and molecular geneticists will be impressed by the good drive performance and directly influenced by the design principles showcased here. The study's integral gene drive architecture that leverages the endogenous nanos regulatory elements, in-frame E2A peptide linkage for co-expression, and intronic insertion of gRNA and selectable markers addresses long-standing challenges in promoter leakage, somatic fitness costs, and resistance allele evolution. What's more, vector biologists and malaria researchers will be interested in the successful deployment of a gene drive in A. gambiae that actually carries a disease-blocking trait.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      We have worked on CRISPR gene drive development in both fruit flies and Anopheles mosquitoes and have experience with modeling their spread.

      References

      Ellis, D.A., Avraam, G., Hoermann, A., Wyer, C.A.S., Ong, Y.X., Christophides, G.K., and Windbichler, N. (2022). Testing non-autonomous antimalarial gene drive effectors using self-eliminating drivers in the African mosquito vector Anopheles gambiae. PLOS Genetics 18, e1010244-e1010244.

      Gantz, V.M., Jasinskiene, N., Tatarenkova, O., Fazekas, A., Macias, V.M., Bier, E., and James, A.A. (2015). Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proc Natl Acad Sci U S A 112, E6736-E6743.

      Gonzalez, E., Anderson, M.A.E., Ang, J.X.D., Nevard, K., Shackleford, L., Larrosa-Godall, M., Leftwich, P.T., and Alphey, L. (2025). Optimization of SgRNA expression with RNA pol III regulatory elements in Anopheles stephensi. Scientific Reports 15, 13408.

      Hammond, A., Galizi, R., Kyrou, K., Simoni, A., Siniscalchi, C., Katsanos, D., Gribble, M., Baker, D., Marois, E., Russell, S., et al. (2016). A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae. Nat Biotechnol 34, 78-83.

      Hammond, A., Karlsson, X., Morianou, I., Kyrou, K., Beaghton, A., Gribble, M., Kranjc, N., Galizi, R., Burt, A., Crisanti, A., et al. (2021). Regulating the expression of gene drives is key to increasing their invasive potential and the mitigation of resistance. PLOS Genetics 17, e1009321-e1009321.

      Hammond, A.M., Kyrou, K., Bruttini, M., North, A., Galizi, R., Karlsson, X., Kranjc, N., Carpi, F.M., D'Aurizio, R., Crisanti, A., et al. (2017). The creation and selection of mutations resistant to a gene drive over multiple generations in the malaria mosquito. PLOS Genetics 13, e1007039-e1007039.

      Kyrou, K., Hammond, A.M., Galizi, R., Kranjc, N., Burt, A., Beaghton, A.K., Nolan, T., and Crisanti, A. (2018). A CRISPR-Cas9 gene drive targeting doublesex causes complete population suppression in caged Anopheles gambiae mosquitoes. Nature Biotechnology 36, 1062-1066.

      Nash, A., Capriotti, P., Hoermann, A., Papathanos, P.A., and Windbichler, N. (2022). Intronic gRNAs for the construction of minimal gene drive systems. Frontiers in Bioengineering and Biotechnology 0, 570-570.

      Weitzel, A.J., Grunwald, H.A., Ceri, W., Levina, R., Gantz, V.M., Hedrick, S.M., Bier, E., and Cooper, K.L. (2021). Meiotic Cas9 expression mediates gene conversion in the male and female mouse germline. Plos Biol 19, e3001478-e3001478.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Lysosomal damage is commonly found in many diseases including normal aging and age-related disease. However, the transcriptional programs activated by lysosomal damage have not been thoroughly characterized. This study aimed to investigate lysosome damage-induced major transcriptional responses and the underlying signaling basis. The authors have convincingly shown that lysosomal damage activates a ubiquitination-dependent signaling axis involving TAB, TAK1, and IKK, which culminates in the activation of NF-kB and subsequent transcriptional upregulation of pro-inflammatory genes and pro-survival genes. Overall, the major aims of this study were successfully achieved.

      Strengths:

      This study is well-conceived and strictly executed, leading to clear and well-supported conclusions. Through unbiased transcriptomics and proteomics screens, the authors identified NF-kB as a major transcriptional program activated upon lysosome damage. TAK1 activation by lysosome damage-induced ubiquitination was found to be essential for NF-kB activation and MAP kinase signaling. The transcriptional and proteomic changes were shown to be largely driven by TAK1 signaling. Finally, the TAK1-IKK signaling was shown to provide resistance to apoptosis during lysosomal damage response. The main signaling axis of this pathway was convincingly demonstrated.

      Weaknesses:

      One weakness was the claim of K63-linked ubiquitination in lysosomal damage-induced NF-kB activation. While it was clear that K63 ubiquitin chains were present on damaged lysosomes, no evidence was shown in the current study to demonstrate the specific requirement of K63 ubiquitin chains in the signaling axis being studied. Clarifying the roles of K63-linked versus other types of ubiquitin chains in lysosomal damage-induced NF-kB activation may improve the mechanistic insights and overall impact of this study.

      Another weakness was that the main conclusions of this study were all dependent on an artificial lysosomal damage agent. It will be beneficial to confirm key findings in other contexts involving lysosomal damage.

      We would like to thank Reviewer #1 for the positive and constructive comments on our study. For a main concern regarding the molecular mechanism by which TAB proteins are activated in response to lysosomal damage, we have added the experimental results to support that the lysosomal accumulation of K63 ubiquitin chains serves as a trigger to activate the TAB-TAK1 pathway. We also investigated and discussed the role of LUBAC-mediated M1 ubiquitin chains in NF-kB activation and the effects of other lysosomal-damaging compounds. Please see the response to “Reviewer #3 (Public review): Suggestions:”.

      Reviewer #2 (Public review):

      Summary:

      Endo et al. investigate the novel role of ubiquitin response upon lysosomal damage in activating cellular signaling for cell survival. The authors provide a comprehensive transcriptome and proteome analysis of aging-related cells experiencing lysosomal damage, identifying transcription factors involved in transcriptome and proteome remodeling with a focus on the NF-κB signaling pathway. They further characterized the K63-ubiquitin-TAB-TAK1-NF-κB signaling axis in controlling gene expression, inflammatory responses, and apoptotic processes.

      Strengths:

      In the aging-related model, the authors provide a comprehensive transcriptome and characterize the K63-ubiquitin-TAB-TAK1-NF-κB signaling axis. Through compelling experiments and advanced tools, they elucidate its critical role in controlling gene expression, inflammatory responses, and apoptotic processes.

      Weaknesses:

      The study lacks deeper connections with previous research, particularly:

      • The established role of TAB-TAK1 in AMPK activation during lysosomal damage

      • The potential significance of TBK1 in NF-κB signaling pathways

      We would like to thank Reviewer #2 for the helpful comments on our study. To achieve a more comprehensive understanding of the signaling pathways involved in the lysosomal damage response, we investigated additional related signal mediators, such as TBK1 and LUBAC. The citations related to AMPK have been incorporated.

      Reviewer #3 (Public review):

      Summary:

      The response to lysosomal damage is a fast-moving and timely field. Besides repair and degradation pathways, increasing interest has been focusing on damage-induced signaling. The authors conducted both transcriptomics and proteomics to characterize the cellular response to lysosomal damage. They identify a signaling pathway leading to activation of NFkappaB. Based on this and supported by Western blot and microscopy data, the authors nicely show that TAB2/3 and TAK1 are activated at damaged lysosomes and kick off the pathway to alter gene expression, which induces cytokines and protects from cell death. TAB2/3 activation is proposed to occur through K63 ubiquitin chain formation. Generally, this is a careful and well conducted study that nicely delineates the pathway under lysosomal stress. The "omics" data serves as a valuable resource for the field. More work should be invested into how TAB2/3 are activated at the damaged lysosomes, also to increase novelty in light of previous reports.

      Strengths:

      Generally, this is a careful and well-conducted study that nicely delineates the pathway under lysosomal stress. The "omics" data serves as a valuable resource for the field.

      Weaknesses:

      More work should be invested into how TAB2/3 are activated at the damaged lysosomes, also to increase novelty in light of previous reports. Moreover, different damage types should be tested to probe relevance for different pathophysiological conditions.

      We would like to thank Reviewer #3 for the valuable comments on our study. We have added the experimental results to address two concerns raised by Reviewer #3. Please see the response to “Reviewer #3 (Public review): Suggestions:”.

      Suggestions:

      (1) A recent paper claims that NFkappaB is activated by Otulin/M1 chains upon lysosome damage through TBK1 (PMID: 39744815). In contrast, Endo et al. nicely show that ubiquitylation is needed (shown by TAK-243) for NFkB activation but only have correlative data to link it specifically to K63 chains. On page 15, line 11, the authors even argue a "potential" involvement of K63. This point should be better dealt with. Can the authors specifically block K63 formation? K63R overexpression or swapping would be one way. Is the K63 ligase ITCH involved (PMID: 38503285) or any other NEDD4-like ligase? This could be compared to LUBAC inhibition. Also, the point needs to be dealt with more controversially in the discussion as these are alternative claims (M1 vs K63, TAB vs TBK1).

      It is well-characterized that the NZF domain of TAB proteins preferentially associates with K63-linked ubiquitin chains. Therefore, we performed the add-back experiment using siRNA-resistant TAB2 WT and mutants incapable of binding to K63-linked ubiquitin chains, dNZF and E685A, to elucidate the requirement of K63 ubiquitin chains for TAK1 activation. We investigated whether the add-back of TAB2 mutants rescues the activation of TAK1 in TAB2-depleted cells (Fig. 2E). TAB2 WT, but not dNZF and E685A, rescued TAK1 activation in response to LLOMe, suggesting that the specific interaction of TAB proteins and K63 ubiquitin chains is a key mechanism to activate TAK1. We also found that the treatment of an E1 inhibitor TAK-243 effectively prevented the lysosomal accumulation of K63 ubiquitin chains, but TAB2 was recruited to damaged lysosomes (Fig. S2B). This suggests that the recruitment of TAB proteins to damaged lysosomes is independent of the association with K63 ubiquitin chains. Collectively, it is postulated that TAB proteins require interaction with K63 ubiquitin chains for TAK1 activation, but not for recruitment to damaged lysosomes. We have added the sentences (p9, lines 7-20, and p10, lines 8-10).

      Next, we confirmed that LUBAC functions are essential for NF-kB activation in the lysosomal damage response. RNF31/HOIP is a component of LUBAC that catalyzes M1 ubiquitination. The depletion of RNF31 showed no significant effects on TAK1 activation, but abolished IKK activation (Fig. S4G). It is well-characterized that LUBAC-mediated M1 ubiquitin chains recruit IKK subunits and transduce the signal downstream in the canonical pathway. We assume that K63 ubiquitin chains in damaged lysosomes initially activate TAB-TAK1 and trigger LUBAC-mediated M1 ubiquitination, and subsequently, M1 ubiquitination functions to recruit the IKK complex. Consequently, activated TAK1 phosphorylates IKK subunits in damaged lysosomes, leading to NF-kB activation. We also examined whether TBK1 is involved in the activation of NF-kB. TBK1 was phosphorylated upon LLOMe treatment, and depletion of TAB and TAK1 resulted in a slight reduction of TBK1 phosphorylation (Fig. S4D, E). Treatment with the TBK1 inhibitor BX-795 had little or no effect on TAK1 activation, but abolished phosphorylation of IKK and IkBa (Fig. S4F). These results suggest that TBK1 is required for the activation of NF-kB. We have added the sentences (p13, line 13-p14, line 10).

      As mentioned by Reviewer #3, it is important to identify the E3 ligase responsible for K63 ubiquitination in the lysosomal damage response. We have been aiming to identify such E3 ligase(s). However, depletion of ITCH and the other E3 ligases tested so far had little or no effect on K63 ubiquitination and TAK1 activation. We would like to explore the responsible E3 ligase(s) in a future study.

      (2) It would be interesting to know what the trigger is that induces the pathway. Lipid perturbation by LLOMe is a good model, but does activation also occur with GPN (osmotic swelling) or lipid peroxidation (oxidative stress) that may be more broadly relevant in a pathophysiological way? Moreover, what damage threshold is needed? Does loss of protons suffice? Can activation be induced with a Ca2+ agonist in the absence of damage?

      To further clarify the initial trigger that induces TAB-TAK1 activation coupled with lysosomal damage, we examined other damage sources, GPN and DC661, which induce hyperosmotic stress and lipid peroxidation in lysosomes, respectively, thereby resulting in lysosomal membrane damage. Under our experimental conditions, treatment with these compounds did not result in significant accumulation of Gal-3, indicating a reduced level of lysosomal membrane permeabilization compared with LLOMe (Fig. S2C, D), and little or no TAK1 activation was observed (Fig. S2E). TAB proteins require their association with K63 ubiquitin chains for TAK1 activation. It is therefore postulated that the severe lysosomal membrane permeabilization that triggers the formation and cytosolic exposure of K63 ubiquitin chains may be a determinant of TAB-TAK1 activation. In our future work, we would like to examine a broader range of lysosomal damage stimuli and further elucidate the initial mechanism of TAB-TAK1 activation. We have added the sentences (p9, line 21-p10, line 7).

      (3) The authors nicely define JNK and p38 activation. This should be emphasized more, possibly also in the abstract, as it may contribute to the claim of increased survival fitness.

      We further tested whether the inhibition of JNK affects the anti-apoptotic effect (Fig. S5B). The inhibition of JNK resulted in an increase in the cleaved caspase-3. This suggests that the anti-apoptotic action in the lysosomal damage response requires JNK as well as IKK. We have added the sentences in results to emphasize the pivotal role of stress-induced MAPKs (p15, lines 7-11).

      Reviewer #1 (Recommendations for the authors):

      (1) Although the ubiquitination-TAB-TAK1-IKK axis was previously characterized in other contexts, specific evidence supporting lysosomal recruitment of these components by ubiquitination during lysosome damage would be beneficial.

      We found that treatment with the E1 inhibitor TAK-243 abolished the lysosomal accumulation of K63 ubiquitin chains, but TAB2 and TAK1 were still recruited to damaged lysosomes (Fig. S2B). This suggests that the recruitment of TAB proteins to damaged lysosomes is independent of their association with K63-linked ubiquitin chains. Next, we investigated whether add-back of TAB2 mutants incapable of binding K63 ubiquitin chains rescues the activation of TAK1 in TAB2-depleted cells (Fig. 2E). K63 ubiquitin binding by TAB2 was essential for TAK1 activation in response to LLOMe. Taken together, these results suggest that TAB proteins require their interaction with K63 ubiquitin chains for TAK1 activation, but not for recruitment to damaged lysosomes. We have added sentences to this effect (p9, lines 7-20, and p10, lines 8-10). Please also see the response to “Reviewer #3 (Public review): Suggestions:”.

      (2) The activation of p38 and JNK by lysosomal damage does not fit well into the main conclusions of the paper, since IKK knockdown was sufficient to block cellular resistance to apoptosis (caspase cleavage in Fig. 5f). Are p38 and JNK also important for cell survival during lysosomal damage?

      We found that inhibition of JNK resulted in an increase in cleaved caspase-3, suggesting that the anti-apoptotic action in the lysosomal damage response requires both IKK and JNK (Fig. S5B). We have added sentences to this effect (p15, lines 7-11).

      (3) Cell death tests are recommended to support the conclusions related to apoptosis.

      As suggested by Reviewer #1, we performed the cell death assay using propidium iodide (PI) and confirmed that HeLa cells co-treated with LLOMe and TAK-243 or HS-276 exhibited increased cell death (Fig. 5E). This indicates a direct correlation between the degree of caspase-3 cleavage and cell death, possibly apoptosis.

      (4) Page 8, line 19-21, gal3 is not exposed upon lysosomal damage. It is recruited from the cytosol by the exposed beta-galactoside-containing glycans on lysosomal membrane proteins.

      We have corrected the corresponding sentence (p7, lines 17-20).

      (5) Carefully checking grammar throughout the text is recommended. Below are a few examples:

      a) Page 4, line 10, remove "that".

      b) "K63 ubiquitin" shall be replaced with "K63 ubiquitination" or "K63 ubiquitin chains".

      c) Page 8, line 9, "remain" should be "remains".

      We have carefully checked the revised manuscript.

      Reviewer #2 (Recommendations for the authors):

      Despite the novelty and significance of these findings in advancing the field, several technical and experimental limitations require further clarification:

      We have responded to each comment. Please see below.

      The manuscript should introduce or discuss previous research showing that TAB-TAK1 facilitates AMPK activation during lysosomal damage and TAK1's increased association with damaged lysosomes (PMID: 31995728).

      We have added the reference (PMID: 31995728) and the sentences (p17, lines 15-20).

      Figure 2A: The differential LAMP1 staining intensity between control and LLOMe-treated cells needs explanation. The weaker LAMP1 signal in the control and the changes in puncta, especially during the 5-minute LLOMe treatment, require detailed clarification.

      We have added the explanation (p8, lines 17-21).

      Recent literature (PMID: 34585663) reports TBK1 activation during lysosomal damage. The authors should investigate or discuss whether TBK1 potentially contributes to NF-κB signaling in this context.

      We experimentally investigated whether TBK1 is involved in the TAB-TAK1 pathway. We confirmed that TBK1 was activated upon LLOMe treatment (Fig. S4D). Depletion of TAB and TAK1 led to a modest decrease in TBK1 phosphorylation (Fig. S4E). Inhibition of TBK1 by BX-795 did not affect TAK1 activation, but abolished phosphorylation of IKK and IκBα (Fig. S4F). This suggests that TBK1 is required for NF-κB activation. We have added the reference (PMID: 34585663) and the corresponding sentences (p13, lines 13-21, p14, lines 8-10, and p18, lines 15-20).

      The introduction of lysosomal damage response lacks comprehensive mechanistic information. For example, while ESCRT is discussed, other critical mechanisms such as lipid transfer and stress granule formation in lysosomal repair should be incorporated. Moreover, mTOR and AMPK signaling pathways undergo significant changes upon lysosomal damage.

      We have added the sentences (p3, lines 16-18, and p3, line 21-p4, line 1).

      The statement "lysosomal permeabilization causes the dissociation of mTORC1 from lysosomes" should explicitly reference PMID: 29625033.

      We have added the suggested reference (PMID: 29625033, p4, line 19).

      The claim that "The elimination of damaged lysosomes through lysophagy requires a period of more than half a day" needs a specific publication citation.

      We have added the reference (PMID: 23921551) to support the stated time-scale of lysosomal clearance (p4, line 21).

      Figure 1G: The label "WO after 2h" lacks explanation in the figure legend and requires detailed interpretation.

      To simplify the figures, we have deleted the label “WO after 2 h” (Fig. 1G, 3F, 5D, F-J, S4G, S5A). Instead, we have added the explanation in the figure legends (Fig. 1G).

      Reviewer #3 (Recommendations for the authors):

      (1) page 8, line 13: it is recommended to phrase colocalisation "at" damaged lysosomes rather than "in" damaged lysosomes as the resolution does not allow the claim of influx into lysosomes.

      We have corrected the word (p8, line 17).

      (2) page 11, line 22: why is "whereas" used to link two events driven by the same mechanism.

      We have corrected the word (p13, line 8).

    1. Reviewer #1 (Public review):

      Summary:

      This study by Bushey et al. focuses on two newly released red-shifted anion-Channelrhodopsins (A1ACR and HfACR, referred to as Ruby-ACRs) in Drosophila. Here, the authors use a combination of electrophysiology, calcium imaging, and behavioral analyses to demonstrate the advantages of Ruby-ACRs over previous optogenetic silencers like the green-shifted GtACR1 and the blue-shifted GtACR2: higher photocurrent, faster kinetics, and operation in a range of the light spectrum that prevents unwanted behavioral effects in the fly. The availability of these new red-shifted silencers constitutes a great addition to the Drosophila genetic toolkit.

      Strengths:

      (1) The authors generate both UAS and LexAop RubyACR reagents and test them in a variety of preparations (electrophysiological recordings, calcium imaging, different behavioral paradigms) that cover the breadth of the fly research environment.

      (2) The optical stimulation parameters are carefully measured and characterized. Especially impressive is that they managed to titrate over both wavelength and intensity across their various assays. This provides a comprehensive dataset to the community.

      (3) Tools are made available to the community through the stock center.

      Weaknesses:

      (1) The authors could better describe their construct and choice of parameters for the chosen construct. I am specifically wondering about the following points:

      a) Why use that particular backbone? It is not the most commonly used one across recent literature (pJFRC7 is more common).

      b) Why do the CsChrimson and GtACR1 constructs have a Kir sequence in them, and why did the authors not put this in the RubyACRs? I would also prefer that the authors not refer to GtACR1 as GTACR-Kir in the text (e.g., in line 72); instead, they should refer to it either as GtACR1 or as GtACR1-kir-mVenus (based on the full genotype mentioned in their table at the end). The same applies to CsChrimson-kir. From what I understand, this is just a Kir trafficking sequence and not the entire Kir sequence, which can confuse readers.

      c) Finally, I would also encourage authors to deposit plasmids on Addgene.

      (2) Figure 2 is interesting, but it is a bit unfortunate that there is a YFP baseline in most of the samples here (except Chrimson88; this should also be mentioned). I wonder how the YFP baseline impacts this data. Could the high intensity stimulation (red light) lead to bleaching of YFP or tdTomato that reduces the baseline in the green channel? All this also makes me wonder if authors tried tagging the RubyACRs with other fluorophores or non-fluorescent tags and how that impacted their functioning. Non-YFP-tagged versions would be more useful for applications involving GCaMP imaging.

      (3) Another point for Figure 2: Since RubyACRs seem to have such a broad activation range, I wonder how much the imaging light (920nm) impacts the baseline in these experiments. If there were plots without the red light stimulation and just varying imaging light intensity, that could be useful to the research community.

      (4) Also, for Figures 2C - D, in the methods authors indicate that the stimulation light intensities were progressively increased. Could this lead to desensitization of opsin? Wouldn't randomized intensities be a better way to do this? Perhaps it should be mentioned as a caveat.

      (5) In Figure 3E the bottom middle panel Vglut-Gal4,GtACR1 shows a major increase in walking at light onset. This seems very different than all other conditions, and I could not find any discussion of this. It would help if some explanation were provided for this.

    2. Reviewer #2 (Public review):

      Summary:

      Bushey et al. investigate the feasibility of using RubyACRs, specifically A1ACR1 and HfACR1 (described previously in (Govorunova et al., 2020)) as red-shifted inhibitory opsins in Drosophila melanogaster. The study employs a wide range of techniques to demonstrate successful neuronal inhibition. Electrophysiology experiments established that HfACR1 was most effective at hyperpolarizing cells, compared to A1ACR1 and GtACR1; both RubyACRs also appeared to be more effective than GtACR1 when the latter was actuated by green light. The authors further demonstrate successful neuronal inhibition using calcium imaging. RubyACRs were also shown to be useful in in vivo behavioral setups, specifically in spontaneous locomotion, associative learning, and courtship paradigms. In the courtship assay, in particular, the authors test multiple wavelengths of light at various light intensities, thus providing a rigorous analysis of the RubyACRs' efficacy under different light conditions.

      Strengths:

      The work provides the Drosophila field with a promising new tool. Red-shifted opsins are particularly advantageous in behavioral assays as red light penetrates the cuticle better than green or blue light, and provides less visual stimulation to the fly. It is also ideal for imaging as it allows for simultaneous optogenetic stimulation and GCaMP imaging. A particular strength of the paper is the direct demonstration of the RubyACRs' capacity to inhibit neurons via electrophysiology and calcium imaging. Furthermore, inhibition effects in the three behavioral assays are strong and convincing. Given the apparent efficacy of RubyACRs and the advantages of a red-sensitive anion channelrhodopsin, this tool has great potential.

      Weaknesses:

      This work convincingly demonstrates the efficacy and potential utility of RubyACRs in Drosophila for imaging and behavior. However, the lethality/toxicity of RubyACRs is a relevant concern that should be addressed in-depth rather than glossed over, as it may pose a major obstacle to use. Discussing this issue in the present study will also help guide potential users and will set the stage for potential future efforts to ameliorate RubyACRs as optogenetic inhibitors.

      Major concerns:

      (1) Table 1 demonstrates high lethality in the RubyACRs compared to GtACR1. For example, in the MI04979-VGlut driver, GtACR1 expression resulted in 32.9% lethality, while HfACR1 expression resulted in 98.7% lethality. This lethality presents an obstacle to the potential adoption of this tool, and should be discussed in detail, rather than in passing. The authors might like to present "% lethality" rather than "% survived", as the former is more relevant when discussing the relative yield and health of flies that can be used in experiments.

      (2) In Figure 3D, driver>opsin flies have lower locomotion during the baseline (i.e., dark) phase, compared to opsin-only controls or GtACR1 flies. For some comparisons, flies are walking around 10-fold slower. For example, in the case of VGlut-GAL4>HfACR1, test flies are walking at <1 mm/s, while "Empty" test flies are walking at ~10 mm/s. This suggests that, for these drivers, neuronal and/or network function is affected. It opens the possibility that the lethality and locomotor defects could be due to cell-autonomous toxicity. We ask the authors to provide a description of this effect in the Results and to discuss it in the Discussion. Relatedly, VGlut-GAL4>GtACR1 flies in red light exhibit a locomotion increase, but this data is not mentioned in the text. The use of differing scales for the Y-axes in these panels can be confusing when the reader is expected to compare velocity across different panels. It would be best if the y-axes were set to a single range, e.g., 0 to 12 mm/s.

      (3) Lethality in broad drivers could result from cell-autonomous toxicity or neuronal dysfunction resulting from RubyACR expression. Ideally, the authors would address or even investigate the possible mechanisms of toxicity of the RubyACRs. Do cells and/or synapses expressing RubyACRs have normal morphology and function? For example, the authors could compare cell survival between flies with RubyACR expression and flies with a fluorescent protein with no opsin. The authors may also want to present lethality data for other, less broad drivers (such as MB320C, which was used for the associative memory assay) in order to demonstrate whether this problem is confined to broad drivers such as VGlut-GAL4, or if this is a problem with narrow drivers as well. If new experiments are not possible, these issues should at least be mentioned in the Discussion.
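As a rough illustration of the "% lethality" metric suggested in major concern (1), the sketch below converts counts of experimental flies and sibling controls into percent lethality. The function name, the default 1:1 expected ratio between the two genotypes, and the example counts are all hypothetical; they are not taken from the manuscript or from Table 1.

```python
def percent_lethality(n_experimental, n_sibling_control, expected_ratio=1.0):
    """Percent lethality of experimental flies relative to sibling controls.

    expected_ratio is the ratio of experimental to control progeny expected
    if the genotype had no effect on viability (assumed to be 1:1 here).
    """
    if n_sibling_control <= 0:
        raise ValueError("need at least one sibling control fly")
    survival = 100.0 * n_experimental / (n_sibling_control * expected_ratio)
    # More survivors than expected is treated as 0% lethality.
    survival = min(survival, 100.0)
    return 100.0 - survival


# Hypothetical counts, for illustration only.
print(percent_lethality(13, 1000))
```

Note that this relative measure inherits the limitation raised in minor concern (1): if the sibling control genotype itself reduces viability, the computed lethality of the experimental genotype is underestimated.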

      Minor concerns

      (1) The specific method used for quantifying lethality is mentioned briefly in Table 1 but is not detailed in the Methods. The authors derive lethality by comparing to a sibling control group with either the opsin or the driver alone, but the opsin alone or driver alone may cause some lethality by themselves. We suggest the use of a viability assay, e.g. (Rockwell et al., 2019), which would give potential users a clearer picture of which developmental stage is most affected by opsin expression, as well as allow opsin-only, driver-only and experimental groups to be assessed separately (lethality would then be reported as the % of embryos that reach each stage of development, and eventually eclosion).

      (2) For the calcium imaging analysis in Figure 2, the U-shaped curve observed for mean ΔF/F0 for A1ACR1 and HfACR1 may not be due to actual desensitization of the channels, as the authors suggest (lines 143-145), but may be due simply to a shifting baseline. The authors use the 5-s period preceding stimulation onset as F0, but in some cases (e.g., HfACR1 at 250 µW/mm²), calcium fluorescence rises above baseline and remains high post-stimulation (ΔF/F0 of +0.5, which we observe is the same magnitude as the ΔF/F0 of -0.5 observed during inhibition), thus affecting the ΔF/F0 for subsequent trials. The authors should discuss this incomplete recovery in the text, or (if available) use a static channel instead to provide a stable F0 for calculating ΔF/F0. Alternatively, if the authors wish to rigorously test the hypothesis that high light intensity indeed results in desensitization of these channels, they may consider using different flies for each light intensity or longer inter-stimulus intervals.

      (3) For Figure 3C (Flybowl assay), the authors mention that "simply expressing the opsins decreased baseline locomotor activity compared to empty driver lines". However, the "Empty" controls in 3C appear to refer to opsin-only controls, not driver-only controls. The driver-only controls are not presented in the figure. The use of "empty" differs between the text and the figure, as the text refers to "empty" driver lines, while the figure uses "empty" to apparently refer to opsin-only controls. We recommend changing the terminology across all figures to be unambiguous, e.g., by using "opsin-only" or "driver-only" as opposed to the ambiguous "empty". In addition, the fact that opsin-only controls move less than driver-only controls may suggest some toxicity as a result of the opsin-only construct; this should be discussed further.

      (4) Figures 4 and 5 lack the reporting of driver-only controls.

      (5) Figures 3 and 4 lack positive controls; that is, the benchmarking of the efficacy of RubyACRs in their respective behavioral paradigms against a known inhibitor, e.g., GtACR1 with green light. To confirm that this GtACR1 transgene is functional, the authors could include GtACR1 with green light as a positive control for these two figures, as they have done for Figure 5-supplement 2 and 3.

      (6) Several citations are missing. In their discussion, the authors highlight that shorter wavelengths of light are more attenuated by tissue (lines 278-281); this should be accompanied by the relevant citations (Inagaki et al., 2014). Similarly, the claim that behavioral experiments exhibit greater sensitivity to shorter wavelengths should be substantiated (lines 281-283).
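The baseline issue raised in minor concern (2) can be made concrete with a small numerical sketch. It assumes ΔF/F0 = (F - F0)/F0, with F0 taken as the mean of the pre-stimulus window. The fluorescence values are made up for illustration, and `dff` is a hypothetical helper, not the authors' analysis code.

```python
def dff(f_stim, f_pre):
    """Return dF/F0 = (F - F0) / F0, with F0 the mean of the pre-stimulus samples."""
    f0 = sum(f_pre) / len(f_pre)
    return (f_stim - f0) / f0


true_rest = 100.0  # resting fluorescence, arbitrary units (made up)
inhibited = 50.0   # fluorescence during inhibition (made up)

# Trial A: the pre-stimulus window sits at the true resting level.
trial_a = dff(inhibited, [true_rest] * 5)

# Trial B: fluorescence has not recovered from the previous trial and sits
# 50% above rest, so F0 is inflated although the inhibited level is unchanged.
trial_b = dff(inhibited, [1.5 * true_rest] * 5)

print(trial_a, trial_b)
```

The same raw fluorescence during stimulation yields different ΔF/F0 values in the two trials, which is why incomplete recovery between trials can distort an intensity-response curve independently of any change in the opsin itself.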

      References:

      Govorunova EG, Sineshchekov OA, Li H, Wang Y, Brown LS, Spudich JL. 2020. RubyACRs, nonalgal anion channelrhodopsins with highly red-shifted absorption. Proc Natl Acad Sci U S A 117:22833-22840.

      Inagaki HK, Jung Y, Hoopfer ED, Wong AM, Mishra N, Lin JY, Tsien RY, Anderson DJ. 2014. Optogenetic control of Drosophila using a red-shifted channelrhodopsin reveals experience-dependent influences on courtship. Nat Methods 11:325-332.

      Rockwell AL, Beaver I, Hongay CF. 2019. A direct and simple method to assess Drosophila melanogaster's viability from embryo to adult. J Vis Exp e59996.

    3. Author response:

      We thank the reviewers for their thoughtful and thorough consideration of the work. We appreciate the positive reception they give the work, and plan to address several of the comments with further experiments. To outline that work (and ensure that we are on the right track to addressing those concerns), we summarize the core concerns that prompt new experiments:

      (1) Does the YFP tag on the ACRs interfere with simultaneous GCaMP imaging of RubyACR-expressing cells and could bleaching of the YFP complicate interpretation of the experiments here?

      We will test whether 920 nm (2p) and 650 nm (1p) excitation cause YFP bleaching that interferes with interpretation of inhibitory calcium (i.e. GCaMP) signals. Because the YFP tag enhances opsin sensitivity, we prioritized these tagged RubyACRs for initial characterization. FLAG-tagged ACRs are in progress, but will take time to fully characterize. Considering that the RubyACR-EYFP versions work very well, and in many cases people will want the YFP tag, either for visualizing expression or to maximize sensitivity, we feel the current work is a valuable contribution on its own. Indeed several labs have already requested these lines.

      (2) Are the ACRs activated by two-photon illumination?

      We will examine GCaMP signals at increasing 2p intensities to determine whether imaging unintentionally activates RubyACRs, as well as whether 2p illumination could be used for intentional opsin activation.

      (3) How toxic is the expression of these opsins?

      We will update the quantification of toxicity in Table 1 to include all the drivers we used in this study. In fact, the toxicity we observed was primarily with the vGlut driver, which is why that was the only information in the table. The other drivers we used did not appreciably reduce survival rate, but showing only the one case where expression did have a big effect left a strong, and understandably inaccurate, impression that toxicity is a major pitfall. We note that the widely used CsChrimson has a similar % survival to the RubyACRs when expressed with these vGlut drivers.

      We also plan to examine whether ACR expression leads to cell-autonomous perturbations. We will determine whether expression leads to some frequency of neuronal cell death, and we will evaluate whether any morphological effects occur.

      We will also clarify in the Discussion that potential toxicity may be driver-specific (as it is here) and should be evaluated case-by-case by investigators using the tool.

      (4) Use functional imaging to confirm inhibition of the neurons used only for behavioral experiments (pIP10 & PPL1-γ1pedc)

      We will perform these imaging experiments. One caveat is that inhibition may not be readily detectable with GCaMP, as the resting calcium levels in pIP10 and PPL1-γ1pedc neurons may already be quite low. This differs from the non-spiking Mi1 neurons, where inhibition was clearly observed with GCaMP. For this reason, we consider the behavioral results stronger evidence of efficacy, but we agree that imaging could provide useful supporting evidence, recognizing that a negative result would be difficult to interpret.

      (5) Confirm that the GtACR1 will inhibit locomotion in the flybowl when activated with green light, its spectral peak.

      We will perform this benchmark experiment. Please note that our intention with this study was to find an effective red-light activated opto-inhibitor because these wavelengths are much less perturbing to behavior. In that respect, regardless of GtACR1’s performance with green light, the RubyACRs clearly provide important new tools for Drosophila behavioral neuroscience.

  4. www.planalto.gov.br
    1. modify
      • Bulletin (Informativo) No. 806
      • SPECIAL COURT (Corte Especial)
      • Case: EAREsp 1.766.665-RS, Rapporteur Justice Francisco Falcão, Rapporteur for the judgment Justice Ricardo Villas Bôas Cueva, Corte Especial, by majority, decided on 3 April 2024.

      Branch of Law: CIVIL PROCEDURE LAW

      Theme: Peace, Justice and Effective Institutions. Coercive fine (multa cominatória). Exorbitant amount. Disproportionality. Accumulated amount. Possibility of revision. Requirement of active conduct by the debtor. Successive revisions. Impossibility. Consummative preclusion.

      HIGHLIGHT - Consummative preclusion applies to the accumulated amount of the coercive fine, so that, once a modification has already been made, no further alteration is possible, and situations already consolidated are preserved.

      INFORMATION FROM THE FULL OPINION - The dispute concerns whether preclusion applies to a decision that revises the amount of astreintes. On this subject, the Corte Especial, in deciding EAREsp n. 650.536-RJ, held that a reduction is possible when the amount is exorbitant, taking into account reasonableness and proportionality, in order to avoid unjust enrichment of the creditor.

      • However, the issue calls for deeper reflection, especially because that decision, although rendered under the current CPC, relied chiefly on majority case law built while CPC/1973 was in force, notably Repetitive Theme n. 706: "The decision imposing astreintes is not subject to preclusion, nor does it give rise to res judicata" (REsp n. 1.333.988/SP, Second Section, Rapporteur Justice Paulo de Tarso Sanseverino, DJe 11 April 2014).

      • Moreover, it was not taken into account that CPC/2015 substantially and expressly changed the legal regime of astreintes as regards the possibility of modification. Indeed, under the premise established in EAREsp n. 650.536-RJ, the rule allowing the judge to alter the coercive fine would be found in art. 461, § 6, of CPC/1973 and in its counterpart, art. 537, § 1, of CPC/2015. However, there is a substantial difference between these two rules, particularly as to which amounts may be modified.

      • From an analysis of these rules, the legislature's clear intention was to authorize revision or exclusion only of the "fine yet to fall due" ("multa vincenda"); that is, the decision cannot have retroactive effect so as to reach the accumulated amount of the fine. On the other hand, some argue that a decision with retroactive effects is possible where the amount of the fine already incurred is reduced, since the term "vincendas" would refer only to the fine that is currently accruing.

      • However, there is no reason to subject modification and exclusion to different legal regimes. The rule of art. 537, § 1, of the CPC makes clear that the legislature chose to preserve situations already consolidated, regardless of whether what is at issue is the fine still accruing or the amount resulting from its accrual. On closer examination, the pendency of a dispute over the amount of the fine bears no relation to its falling due, but rather to its definitiveness.

      • Thus, if the accrual of the fine during the period of default reaches exorbitant amounts, whether because the debtor remained inert and did not request revision or exclusion, or because the judge did not act of his or her own motion, any decision that may be rendered could, as a rule, produce only prospective effects.

      • The legislature of CPC/2015 evidently chose to take the debtor's conduct into account, in order to reward the debtor who, although in default at first, ends up performing the obligation, even if partially, or who demonstrates that performance is impossible. In other words, only the debtor who abandons recalcitrance is entitled to a reduction of the fine.

      • Accordingly, under the express rule of art. 537, § 1, of the CPC, it would be possible to alter only the accumulated amount of fines yet to fall due and, as provided in item II, the reduction requires active conduct by the debtor, embodied in partial performance of the obligation or in a showing that performance is impossible.

      • In any event, in the present case there is another obstacle to the revision sought, namely consummative preclusion pro judicato, since the amount of the daily fine had already been revised.

      • The STJ settled, in a special appeal decided under the repetitive-appeals procedure, that "the decision imposing astreintes is not subject to preclusion, nor does it give rise to res judicata" (Theme 706), as already noted. This, however, concerns the non-application of temporal preclusion, so that the amount of the fine may be modified at any time. It does not mean the absence of consummative preclusion, which would seriously violate legal certainty.

      • Thus, once the fine has been set, it may be altered or excluded at any time. However, once the amount has been reduced, successive revisions at the whim of the recalcitrant defaulter are not permissible, lest unjustified obstinacy be encouraged and rewarded. In other words, it is possible to modify the decision that imposes the fine, but it is not permissible to modify what has already been modified.

      • Considering that the coercive fine is a most important instrument for ensuring the effectiveness of judicial decisions and may be set by the court of its own motion, it is a matter of public policy (ordem pública). In this case, the fine set in a final and unappealable judgment may be altered at the enforcement stage because it is by nature a procedural technique, and so is not covered by substantive res judicata. Once set or altered at the start of enforcement, it retains that nature and may therefore be modified at any time, including by the court of its own motion.

      • However, the accumulated amount of the fine ceases to be a procedural technique and becomes part of the judgment creditor's assets as a monetary claim, losing its nature as a matter of public policy. Indeed, under art. 537, § 2, of the CPC, "the [accumulated] amount of the fine shall be owed to the judgment creditor".

      • Moreover, even if the accumulated fine were also regarded as a matter of public policy, consummative preclusion pro judicato must apply, so that, once a modification has been made, no further alteration is possible, and situations already consolidated are preserved, as art. 537, § 1, of the CPC makes clear in referring to the "fine yet to fall due" ("multa vincenda"). This is because consummative preclusion applies to matters of public policy, including those outside the parties' power of disposition, such as procedural prerequisites and the conditions of the action, according to the settled understanding of the STJ.

      • Accordingly, and with even greater reason, consummative preclusion applies to the accumulated amount of the coercive fine, since it is patrimonial and disposable in nature.

    1. Reviewer #1 (Public review):

      Summary:

      Review of the manuscript titled "Mycobacterial Metallophosphatase MmpE acts as a nucleomodulin to regulate host gene expression and promotes intracellular survival".

      The study provides an insightful characterization of the mycobacterial secreted effector protein MmpE, which translocates to the host nucleus and exhibits phosphatase activity. The study characterizes the nuclear localization signal sequences and residues critical for the phosphatase activity, both of which are required for intracellular survival.

      Strengths:

      (1) The study addresses the role of nucleomodulins, an understudied aspect in mycobacterial infections.

      (2) The authors employ a combination of biochemical and computational analyses along with in vitro and in vivo validations to characterize the role of MmpE.

      Weaknesses:

      (1) While the study establishes that the phosphatase activity of MmpE operates independently of its NLS, there is a clear gap in understanding how this phosphatase activity supports mycobacterial infection. The investigation lacks experimental data on specific substrates of MmpE or pathways influenced by this virulence factor.

      (2) The study does not explore whether the phosphatase activity of MmpE is dependent on the NLS within macrophages, which would provide critical insights into its biological relevance in host cells. Conducting experiments with double knockout/mutant strains and comparing their intracellular survival with single mutants could elucidate these dependencies and further validate the significance of MmpE's dual functions.

      (3) The study does not provide direct experimental validation of the MmpE deletion on lysosomal trafficking of the bacteria.

      (4) The role of MmpE as a mycobacterial effector would be more relevant using virulent mycobacterial strains such as H37Rv.

    2. Reviewer #3 (Public review):

      Summary:

      In this manuscript titled "Mycobacterial Metallophosphatase MmpE Acts as a Nucleomodulin to Regulate Host Gene Expression and Promote Intracellular Survival", Chen et al. describe the biochemical characterisation, localisation and potential functions of the gene using a genetic approach in M. bovis BCG, and perform macrophage and mouse infections to understand the roles of this potentially secreted protein in the host cell nucleus. The findings demonstrate the role of a secreted phosphatase of M. bovis BCG in shaping the transcriptional profile of infected macrophages, potentially through nuclear localisation and direct binding to transcriptional start sites, thereby regulating the inflammatory response to infection.

      Strengths:

      The authors demonstrate using a transient transfection method that MmpE when expressed as a GFP-tagged protein in HEK293T cells, exhibits nuclear localisation. The authors identify two NLS motifs that together are required for nuclear localisation of the protein. A deletion of the gene in M. bovis BCG results in poorer survival compared to the wild-type parent strain, which is also killed by macrophages. Relative to the WT strain-infected macrophages, macrophages infected with the ∆mmpE strain exhibited differential gene expression. Overexpression of the gene in HEK293T led to occupancy of the transcription start site of several genes, including the Vitamin D Receptor. Expression of VDR in THP1 macrophages was lower in the case of ∆mmpE infection compared to WT infection. This data supports the utility of the overexpression system in identifying potential target loci of MmpE using the HEK293T transfection model. The authors also demonstrate that the protein is a phosphatase, and the phosphatase activity of the protein is partially required for bacterial survival but not for the regulation of the VDR gene expression.

      Weaknesses:

      (1) While the motifs can most certainly behave as NLSs, the overexpression of a mycobacterial protein in HEK293T cells can also result in artefacts of nuclear localisation. This is not unprecedented. Therefore, to prove that the protein is indeed secreted from BCG, and is able to elicit transcriptional changes during infection, I recommend that the authors (i) establish that the protein is indeed secreted into the host cell nucleus, and (ii) the NLS mutation prevents its localisation to the nucleus without disrupting its secretion.

      Demonstration that the protein is secreted: Supplementary Figure 3 - Immunoblotting should be performed for a cytosolic protein, also to rule out detection of proteins from lysis of dead cells. Also, for detecting proteins in the secreted fraction, it would be better to use Sauton's media without detergent, and grow the cultures without agitation or with gentle agitation. The method used by the authors is not a recommended protocol for obtaining the secreted fraction of mycobacteria.

      Demonstration that the protein localises to the host cell nucleus upon infection: Perform an infection followed by immunofluorescence to demonstrate that the endogenous protein of BCG can translocate to the host cell nucleus. This should be done for an NLS1-2 mutant expressing cell also.

      (2) In the RNA-seq analysis, the directionality of change of each of the reported pathways is not apparent in the way the data have been presented. For example, are genes in the cytokine-cytokine receptor interaction or TNF signalling pathway expressed more, or less in the ∆mmpE strain?

      (3) Several of these pathways are affected as a result of infection, while others are not induced by BCG infection. For example, BCG infection does not, on its own, produce changes in IL1β levels. As the authors did not compare the uninfected macrophages as a control, it is difficult to interpret whether ∆mmpE induced higher expression than the WT strain, or simply did not induce a gene while the WT strain suppressed expression of a gene. This is particularly important because the strain is attenuated. Does the attenuation have anything to do with the ability of the protein to induce lysosomal pathway genes? Does induction of this pathway lead to attenuation of the strain? Similarly, for pathways that seem to be downregulated in the ∆mmpE strain compared to the WT strain, these might have been induced upon infection with the WT strain but not sufficiently by the ∆mmpE strain due to its attenuation/ lower bacterial burden.

      (4) ChIP-seq should be performed in THP1 macrophages, and not in HEK293T. Overexpression of a nuclear-localised protein in a non-relevant line is likely to lead to several transcriptional changes that do not inform us of the role of the gene as a transcriptional regulator during infection.

      (5) I would not expect to see such large inflammatory reactions persisting 56 days post-infection with M. bovis BCG. Is this something peculiar for an intratracheal infection with 1×10⁷ bacilli? For images of animal tissue, the authors should provide images of the entire lung lobe with the zoomed-in image indicated as an inset.

      (6) For the qRT-PCR based validation, infections should be performed with the MmpE-complemented strain in the same experiments as those for the WT and ∆mmpE strain so that they can be on the same graph, in the main manuscript file. Supplementary Figure 4 has three complementary strains. Again, the absence of the uninfected, WT, and ∆mmpE infected condition makes interpretation of these data very difficult.

      (7) The abstract mentions that MmpE represses the PI3K-Akt-mTOR pathway, which arrests phagosome maturation. There is not enough data in this manuscript in support of this claim. Supplementary Figure 5 does provide qRT-PCR validation of genes of this pathway, but the data do not indicate that higher expression of these pathways, whether by VDR repression or otherwise, is driving the growth restriction of the ∆mmpE strain.

      (8) The relevance of the NLS and the phosphatase activity is not completely clear in the CFU assays and in the gene expression data. Firstly, there needs to be immunoblot data provided for the expression and secretion of the NLS-deficient and phosphatase mutants. Secondly, CFU data in Figure 3A, C, and E must consistently include both the WT and ∆mmpE strain.

    3. Author response:

      Reviewer #1 (Public review):

      Summary:

      Review of the manuscript titled "Mycobacterial Metallophosphatase MmpE acts as a nucleomodulin to regulate host gene expression and promotes intracellular survival".

      The study provides an insightful characterization of the mycobacterial secreted effector protein MmpE, which translocates to the host nucleus and exhibits phosphatase activity. The study characterizes the nuclear localization signal sequences and residues critical for the phosphatase activity, both of which are required for intracellular survival.

      Strengths:

      (1) The study addresses the role of nucleomodulins, an understudied aspect in mycobacterial infections.

      (2) The authors employ a combination of biochemical and computational analyses along with in vitro and in vivo validations to characterize the role of MmpE.

      Weaknesses:

      (1) While the study establishes that the phosphatase activity of MmpE operates independently of its NLS, there is a clear gap in understanding how this phosphatase activity supports mycobacterial infection. The investigation lacks experimental data on specific substrates of MmpE or pathways influenced by this virulence factor.

      We thank the reviewer for this insightful comment and agree that identification of the substrate of MmpE is important to fully understand its role in mycobacterial infection.

      MmpE is a putative purple acid phosphatase (PAP) and a member of the metallophosphoesterase (MPE) superfamily. Enzymes in this family are known for their catalytic promiscuity and broad substrate specificity, acting on phosphomonoesters, phosphodiesters, and phosphotriesters (Matange et al., Biochem J., 2015). In bacteria, several characterized MPEs have been shown to hydrolyze substrates such as cyclic nucleotides (e.g., cAMP) (Keppetipola et al., J Biol Chem, 2008; Shenoy et al., J Mol Biol, 2007), nucleotide derivatives (e.g., AMP, UDP-glucose) (Innokentev et al., mBio, 2025), and pyrophosphate-containing compounds (e.g., Ap4A, UDP-DAGn) (Matange et al., Biochem J., 2015). Although the binding motif of MmpE has been identified, determining its physiological substrates remains challenging due to the low abundance and instability of potential metabolites, as well as the limited sensitivity and coverage of current metabolomic technologies in mycobacteria.

      (2) The study does not explore whether the phosphatase activity of MmpE is dependent on the NLS within macrophages, which would provide critical insights into its biological relevance in host cells. Conducting experiments with double knockout/mutant strains and comparing their intracellular survival with single mutants could elucidate these dependencies and further validate the significance of MmpE's dual functions.

      We thank the reviewer for the comment. In our study, we demonstrate that both the nuclear localization and phosphatase activity of MmpE are required for full virulence (Figure 3D–E). Importantly, deletion of the NLS motifs did not impair MmpE’s phosphatase activity in vitro (Figure 2F), indicating that its enzymatic function is structurally independent of its nuclear localization. These findings suggest that MmpE functions as a bifunctional protein, with distinct and non-overlapping roles for its nuclear trafficking and phosphatase activity. We have expanded on this point in the Discussion section “MmpE Functions as a Bifunctional Protein with Nuclear Localization and Phosphatase Activity”.

      (3) The study does not provide direct experimental validation of the MmpE deletion on lysosomal trafficking of the bacteria.

      We thank the reviewer for the comment. The role of Rv2577/MmpE in phagosome maturation has been demonstrated in M. tuberculosis, where its deletion increases colocalization with lysosomal markers such as LAMP-2 and LAMP-3 (Forrellad et al., Front Microbiol, 2020). In our study, we found that mmpE deletion in M. bovis BCG led to upregulation of lysosomal genes, including TFEB, LAMP1, LAMP2, and v-ATPase subunits, compared to the wild-type strain. These results suggest that MmpE may regulate lysosomal trafficking by interfering with phagosome–lysosome fusion.

      To further validate MmpE’s role in phagosome maturation, we will perform fluorescence colocalization assays in THP-1 macrophages infected with BCG/wt, ∆mmpE, complemented, and NLS-mutant strains. Co-staining with LAMP1 and LysoTracker will allow us to assess whether the ∆mmpE mutant is more efficiently trafficked to lysosomes.

      (4) The role of MmpE as a mycobacterial effector would be more relevant using virulent mycobacterial strains such as H37Rv.

      We thank the reviewer for the comment. The role of Rv2577/MmpE as a virulence factor was previously demonstrated in M. tuberculosis CDC 1551, where its deletion significantly reduced bacterial replication in mouse lungs at 30 days post-infection (Forrellad et al., Front Microbiol, 2020). However, that study did not explore the underlying mechanism of MmpE function. In our work, we found that MmpE enhances M. bovis BCG survival in both macrophages (THP-1 and RAW264.7) and mice (Figure 2A-B, Figure 6A), consistent with its proposed role in virulence. To investigate the molecular mechanism by which MmpE promotes intracellular survival, we used M. bovis BCG as a biosafe surrogate; this model is widely accepted for studying mycobacterial pathogenesis (Wang et al., Nat Immunol, 2015; Wang et al., Nat Commun, 2017; Péan et al., Nat Commun, 2017).

      Reviewer #2 (Public review):

      Summary:

      In this paper, the authors have characterized Rv2577 as a Fe³⁺/Zn²⁺-dependent metallophosphatase and a nucleomodulin protein. The authors have also identified His348 and Asn359 as critical residues for Fe³⁺ coordination. The authors show that the protein contains two nuclear localization signals. Using C-terminal Flag expression constructs, the authors have shown that the MmpE protein is secretory. The authors have prepared genetic deletion strains and show that MmpE is essential for intracellular survival of M. bovis BCG in THP-1 macrophages, RAW264.7 macrophages, and a mouse model of infection. The authors have also performed RNA-seq analysis to compare the transcriptional profiles of macrophages infected with wild-type and MmpE mutant strains. The relative levels of ~175 transcripts were altered in MmpE mutant-infected macrophages and the majority of these were associated with various immune and inflammatory signalling pathways. Using these deletion strains, the authors proposed that MmpE inhibits inflammatory gene expression by binding to the promoter region of a vitamin D receptor. The authors also showed that MmpE arrests phagosome maturation by regulating the expression of several lysosome-associated genes such as TFEB, LAMP1, LAMP2, etc. These findings reveal a sophisticated mechanism by which a bacterial effector protein manipulates gene transcription and promotes intracellular survival.

      Strength:

      The authors have used a combination of cell biology, microbiology, and transcriptomics to elucidate the mechanisms by which Rv2577 contributes to intracellular survival.

      Weakness:

      The authors should thoroughly check the mice data and show individual replicate values in bar graphs.

      We thank the reviewer for the advice. We will update the relevant mouse data in the revised manuscript.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript titled "Mycobacterial Metallophosphatase MmpE Acts as a Nucleomodulin to Regulate Host Gene Expression and Promote Intracellular Survival", Chen et al describe biochemical characterisation, localisation and potential functions of the gene using a genetic approach in M. bovis BCG and perform macrophage and mice infections to understand the roles of this potentially secreted protein in the host cell nucleus. The findings demonstrate the role of a secreted phosphatase of M. bovis BCG in shaping the transcriptional profile of infected macrophages, potentially through nuclear localisation and direct binding to transcriptional start sites, thereby regulating the inflammatory response to infection.

      Strengths:

      The authors demonstrate using a transient transfection method that MmpE when expressed as a GFP-tagged protein in HEK293T cells, exhibits nuclear localisation. The authors identify two NLS motifs that together are required for nuclear localisation of the protein. A deletion of the gene in M. bovis BCG results in poorer survival compared to the wild-type parent strain, which is also killed by macrophages. Relative to the WT strain-infected macrophages, macrophages infected with the ∆mmpE strain exhibited differential gene expression. Overexpression of the gene in HEK293T led to occupancy of the transcription start site of several genes, including the Vitamin D Receptor. Expression of VDR in THP1 macrophages was lower in the case of ∆mmpE infection compared to WT infection. This data supports the utility of the overexpression system in identifying potential target loci of MmpE using the HEK293T transfection model. The authors also demonstrate that the protein is a phosphatase, and the phosphatase activity of the protein is partially required for bacterial survival but not for the regulation of the VDR gene expression.

      Weaknesses:

      (1) While the motifs can most certainly behave as NLSs, the overexpression of a mycobacterial protein in HEK293T cells can also result in artefacts of nuclear localisation. This is not unprecedented. Therefore, to prove that the protein is indeed secreted from BCG, and is able to elicit transcriptional changes during infection, I recommend that the authors (i) establish that the protein is indeed secreted into the host cell nucleus, and (ii) the NLS mutation prevents its localisation to the nucleus without disrupting its secretion.

      We thank the reviewer for the advice and will include the relevant experiments in the revised manuscript. The localization of WT MmpE and NLS-mutated MmpE will be tested in BCG-infected macrophages.

      Demonstration that the protein is secreted: Supplementary Figure 3 - Immunoblotting should be performed for a cytosolic protein, also to rule out detection of proteins from lysis of dead cells. Also, for detecting proteins in the secreted fraction, it would be better to use Sauton's media without detergent, and grow the cultures without agitation or with gentle agitation. The method used by the authors is not a recommended protocol for obtaining the secreted fraction of mycobacteria.

      We agree with the reviewer and will further validate the secretion of MmpE using the suggested protocol.

      Demonstration that the protein localises to the host cell nucleus upon infection: Perform an infection followed by immunofluorescence to demonstrate that the endogenous protein of BCG can translocate to the host cell nucleus. This should be done for an NLS1-2 mutant expressing cell also.

      We will add this experiment to the revised manuscript.

      (2) In the RNA-seq analysis, the directionality of change of each of the reported pathways is not apparent in the way the data have been presented. For example, are genes in the cytokine-cytokine receptor interaction or TNF signalling pathway expressed more, or less in the ∆mmpE strain?

      We thank the reviewer for pointing this out and fully agree that conventional KEGG pathway enrichment diagrams do not convey the directionality of individual gene expression changes within each pathway. While KEGG enrichment analysis identifies pathways that are statistically overrepresented among differentially expressed genes, it does not indicate whether individual genes within those pathways are upregulated or downregulated.

      To address this, we re-analyzed the expression trends of DEGs within each significantly enriched KEGG pathway. The results show that key immune-related pathways, including cytokine–cytokine receptor interaction, TNF signaling, NF-κB signaling, and chemokine signaling, are collectively upregulated in THP-1 macrophages infected with ∆mmpE strain compared to those infected with the wild-type BCG strain. The full list of DEGs will be provided in the supplementary materials. The complete RNA-seq dataset has been deposited in the GEO database, and the accession number will be included in the revised manuscript.
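      As an illustration of the kind of per-pathway directionality summary described above, the following is a minimal sketch, not the authors' pipeline. It assumes that each DEG has a log2 fold change (∆mmpE versus WT) and that pathway membership is available as gene lists; the function name `pathway_direction`, the gene symbols, and the numeric values are all hypothetical placeholders.

```python
from statistics import mean


def pathway_direction(log2fc, pathways):
    """Summarise the direction of change of DEGs within each pathway.

    log2fc:   dict mapping gene symbol -> log2 fold change (mutant vs WT).
    pathways: dict mapping pathway name -> iterable of member gene symbols.
    Returns a dict mapping pathway name -> (n_up, n_down, mean_log2fc).
    Pathways with no DEG members are skipped.
    """
    summary = {}
    for name, genes in pathways.items():
        # Keep only pathway members that are in the DEG table.
        values = [log2fc[g] for g in genes if g in log2fc]
        if not values:
            continue
        n_up = sum(v > 0 for v in values)
        n_down = sum(v < 0 for v in values)
        summary[name] = (n_up, n_down, mean(values))
    return summary


# Toy values for illustration only; these are NOT data from the study.
toy_log2fc = {"GENE_A": 1.5, "GENE_B": 0.5, "GENE_C": -1.0}
toy_pathways = {
    "pathway_1": ["GENE_A", "GENE_B"],
    "pathway_2": ["GENE_C", "GENE_X"],  # GENE_X is not a DEG
    "pathway_3": ["GENE_Y"],            # no DEG members, so skipped
}
result = pathway_direction(toy_log2fc, toy_pathways)
```

      Reporting the up and down counts alongside the mean makes it visible when a pathway is enriched but contains genes moving in both directions, which a single enrichment score would hide.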

      (3) Several of these pathways are affected as a result of infection, while others are not induced by BCG infection. For example, BCG infection does not, on its own, produce changes in IL1β levels. As the authors did not compare the uninfected macrophages as a control, it is difficult to interpret whether ∆mmpE induced higher expression than the WT strain, or simply did not induce a gene while the WT strain suppressed expression of a gene. This is particularly important because the strain is attenuated. Does the attenuation have anything to do with the ability of the protein to induce lysosomal pathway genes? Does induction of this pathway lead to attenuation of the strain? Similarly, for pathways that seem to be downregulated in the ∆mmpE strain compared to the WT strain, these might have been induced upon infection with the WT strain but not sufficiently by the ∆mmpE strain due to its attenuation/lower bacterial burden.

      We thank the reviewer for the comment. We will update qRT-PCR data with the uninfected macrophages as a control in the revised manuscript.

      The wild-type Mycobacterium bovis BCG strain retains the ability to inhibit phagosome maturation (Branzk et al., Nat Immunol, 2014; Weng et al., Nat Commun, 2022). Forrellad et al. previously identified Rv2577/MmpE as a virulence factor in M. tuberculosis and showed that disruption of the mmpE gene impairs the ability of M. tuberculosis to arrest phagosome maturation (Forrellad et al., Front Microbiol, 2020). In our study, transcriptomic and qRT-PCR data (Figures 4C and G, S4C) show that deletion of mmpE in M. bovis BCG leads to upregulation of lysosomal biogenesis and acidification genes, including TFEB, LAMP1, and v-ATPase. To further validate MmpE’s role in phagosome maturation, we will perform fluorescence colocalization assays in THP-1 macrophages infected with BCG/wt, ∆mmpE, complemented, and NLS-mutant strains. Co-staining with LAMP1 and LysoTracker will assess whether the ∆mmpE mutant is more efficiently trafficked to lysosomes.

      Furthermore, CFU assays demonstrated that the ∆mmpE strain exhibits markedly reduced bacterial survival in both human THP-1 and murine RAW264.7 macrophages, as well as in mice, compared to the wild-type strain (Figures 4A and C, 6A). These findings suggest that the loss of MmpE compromises bacterial survival, likely due to enhanced lysosomal trafficking and acidification. This supports previous studies showing that increased lysosomal activity promotes mycobacterial clearance (Gutierrez et al., Cell, 2004; Pilli et al., Immunity, 2012).

      (4) ChIP-seq should be performed in THP1 macrophages, and not in HEK293T. Overexpression of a nuclear-localised protein in a non-relevant line is likely to lead to several transcriptional changes that do not inform us of the role of the gene as a transcriptional regulator during infection.

      We thank the reviewer for the comment. We performed ChIP-seq in HEK293T cells because this cell line is widely used in ChIP-based assays owing to its high transfection efficiency, robust nuclear protein expression, and well-annotated genome (Lampe et al., Nat Biotechnol, 2024; Marasco et al., Cell, 2022). These features make HEK293T an ideal system for the initial identification of genome-wide chromatin binding profiles of novel nuclear effectors such as MmpE.

      Furthermore, we validated the major observations in THP-1 macrophages: (i) RNA-seq of THP-1 cells infected with either WT BCG or ∆mmpE strains revealed significant transcriptional changes in immune and lysosomal pathways (Figure 4A); (ii) integrated analysis of CUT&Tag and RNA-seq data identified 298 genes in infected THP-1 cells that exhibited both MmpE binding and corresponding expression changes. Among these, VDR was validated as a direct transcriptional target of MmpE using EMSA and ChIP-PCR (Figures 5E-J, S5D-F). Notably, the signaling pathways associated with MmpE-bound genes, including PI3K-Akt-mTOR signaling and lysosomal function, substantially overlap with those transcriptionally modulated in infected THP-1 macrophages (Figures 4B-G, S4B-C, S5C-D), further supporting the biological relevance of the ChIP-seq data obtained from HEK293T cells.

      (5) I would not expect to see such large inflammatory reactions persisting 56 days post-infection with M. bovis BCG. Is this something peculiar for an intratracheal infection with 1×10⁷ bacilli? For images of animal tissue, the authors should provide images of the entire lung lobe with the zoomed-in image indicated as an inset.

      We thank the reviewer for the comment. The lung inflammation peaked at days 21–28 and had clearly subsided by day 56 across all groups (Figure 6B), consistent with the expected resolution of immune responses to an attenuated strain like M. bovis BCG. This temporal pattern is in line with previous studies using intravenous or intratracheal BCG vaccination in mice and macaques, which also demonstrated robust early immune activation followed by resolution over time (Smith et al., Nat Microbiol, 2025; Darrah et al., Nature, 2020).

      In this study, the infectious dose (1×10⁷ CFU intratracheally) was selected based on previous studies in which intratracheal delivery of 1×10⁷ CFU produced consistent and measurable lung immune responses and pathology without causing overt illness or mortality (Xu et al., Sci Rep, 2017; Niroula et al., Sci Rep, 2025). We will provide whole-lung lobe images with zoomed-in insets in the revised manuscript.

      (6) For the qRT-PCR based validation, infections should be performed with the MmpE-complemented strain in the same experiments as those for the WT and ∆mmpE strain so that they can be on the same graph, in the main manuscript file. Supplementary Figure 4 has three complementary strains. Again, the absence of the uninfected, WT, and ∆mmpE infected condition makes interpretation of these data very difficult.

      We thank the reviewer for the comment. As suggested, we will conduct the qRT-PCR experiment including the uninfected, WT, ∆mmpE, Comp-MmpE, and the three complementary strains infecting THP-1 cells. The updated data will be provided in the revised manuscript.

      (7) The abstract mentions that MmpE represses the PI3K-Akt-mTOR pathway, which arrests phagosome maturation. There is not enough data in this manuscript in support of this claim. Supplementary Figure 5 does provide qRT-PCR validation of genes of this pathway, but the data do not indicate that higher expression of these pathways, whether by VDR repression or otherwise, is driving the growth restriction of the ∆mmpE strain.

      We thank the reviewer for the comment. The role of MmpE in phagosome maturation was previously characterized. Disruption of mmpE impairs the ability of M. tuberculosis to arrest lysosomal trafficking (Forrellad et al., Front Microbiol, 2020). In this study, we further found that MmpE suppresses the expression of key lysosomal genes, including TFEB, LAMP1, LAMP2, and ATPase subunits (Figure 4G), suggesting MmpE is involved in arresting phagosome maturation. As noted, the genes in the PI3K–Akt–mTOR pathway are upregulated in ∆mmpE-infected macrophages (Figure S5C).

      To functionally validate this, we will conduct two complementary experimental approaches:

      (i) Immunofluorescence assays: We will assess phagosome maturation and lysosomal fusion in THP-1 cells infected with BCG/wt, ∆mmpE, Comp-MmpE, and NLS mutant strains. Colocalization of intracellular bacteria with LAMP1 and LysoTracker will be quantified to determine whether the ∆mmpE strain is more efficiently trafficked to lysosomes.

      (ii) CFU assays: We will perform CFU assays in THP-1 cells infected with BCG/wt or ∆mmpE in the presence or absence of PI3K-Akt-mTOR pathway inhibitors (e.g., Dactolisib), to assess whether activation of this pathway contributes to the intracellular growth restriction observed in the ∆mmpE strain.

      (8) The relevance of the NLS and the phosphatase activity is not completely clear in the CFU assays and in the gene expression data. Firstly, there needs to be immunoblot data provided for the expression and secretion of the NLS-deficient and phosphatase mutants. Secondly, CFU data in Figure 3A, C, and E must consistently include both the WT and ∆mmpE strain.

      We thank the reviewer for the comment. We will provide immunoblot data for the expression and secretion of the NLS-deficient and phosphatase mutants. Additionally, we will revise Figure 3A, 3C, and 3E to consistently include both the WT and ΔmmpE strains in the CFU assays.

      References

      Branzk N, Lubojemska A, Hardison SE, Wang Q, Gutierrez MG, Brown GD, Papayannopoulos V (2014) Neutrophils sense microbe size and selectively release neutrophil extracellular traps in response to large pathogens Nat Immunol 15:1017-25.

      Darrah PA, Zeppa JJ, Maiello P, Hackney JA, Wadsworth MH 2nd, Hughes TK, Pokkali S, Swanson PA 2nd, Grant NL, Rodgers MA, Kamath M, Causgrove CM, Laddy DJ, Bonavia A, Casimiro D, Lin PL, Klein E, White AG, Scanga CA, Shalek AK, Roederer M, Flynn JL, Seder RA (2020) Prevention of tuberculosis in macaques after intravenous BCG immunization Nature 577:95-102.

      Forrellad MA, Blanco FC, Marrero Diaz de Villegas R, Vázquez CL, Yaneff A, García EA, Gutierrez MG, Durán R, Villarino A, Bigi F (2020) Rv2577 of Mycobacterium tuberculosis Is a virulence factor with dual phosphatase and phosphodiesterase functions Front Microbiol 11:570794.

      Gutierrez MG, Master SS, Singh SB, Taylor GA, Colombo MI, Deretic V (2004) Autophagy is a defense mechanism inhibiting BCG and Mycobacterium tuberculosis survival in infected macrophages Cell 119:753-66.

      Innokentev A, Sanchez AM, Monetti M, Schwer B, Shuman S (2025) Efn1 and Efn2 are extracellular 5'-nucleotidases induced during the fission yeast response to phosphate starvation mBio 16: e0299224.

      Keppetipola N, Shuman S (2008) A phosphate-binding histidine of binuclear metallophosphodiesterase enzymes is a determinant of 2',3'-cyclic nucleotide phosphodiesterase activity J Biol Chem 283:30942-9.

      Lampe GD, King RT, Halpin-Healy TS, Klompe SE, Hogan MI, Vo PLH, Tang S, Chavez A, Sternberg SH (2024) Targeted DNA integration in human cells without double-strand breaks using CRISPR-associated transposases Nat Biotechnol 42:87-98.

      Marasco LE, Dujardin G, Sousa-Luís R, Liu YH, Stigliano JN, Nomakuchi T, Proudfoot NJ, Krainer AR, Kornblihtt AR (2022) Counteracting chromatin effects of a splicing-correcting antisense oligonucleotide improves its therapeutic efficacy in spinal muscular atrophy Cell 185:2057-2070.e15.

      Matange N, Podobnik M, Visweswariah SS (2015) Metallophosphoesterases: structural fidelity with functional promiscuity Biochem J 467:201-16.

      Niroula N, Ghodasara P, Marreros N, Fuller B, Sanderson H, Zriba S, Walker S, Shury TK, Chen JM (2025) Orally administered live BCG and heat-inactivated Mycobacterium bovis protect bison against experimental bovine tuberculosis Sci Rep 15:3764.

      Péan CB, Schiebler M, Tan SW, Sharrock JA, Kierdorf K, Brown KP, Maserumule MC, Menezes S, Pilátová M, Bronda K, Guermonprez P, Stramer BM, Andres Floto R, Dionne MS (2017) Regulation of phagocyte triglyceride by a STAT-ATG2 pathway controls mycobacterial infection Nat Commun 8:14642.

      Pilli M, Arko-Mensah J, Ponpuak M, Roberts E, Master S, Mandell MA, Dupont N, Ornatowski W, Jiang S, Bradfute SB, Bruun JA, Hansen TE, Johansen T, Deretic V (2012) TBK-1 promotes autophagy-mediated antimicrobial defense by controlling autophagosome maturation Immunity 37:223-34.

      Shenoy AR, Capuder M, Draskovic P, Lamba D, Visweswariah SS, Podobnik M (2007) Structural and biochemical analysis of the Rv0805 cyclic nucleotide phosphodiesterase from Mycobacterium tuberculosis J Mol Biol 365:211-25.

      Smith AA, Su H, Wallach J, Liu Y, Maiello P, Borish HJ, Winchell C, Simonson AW, Lin PL, Rodgers M, Fillmore D, Sakal J, Lin K, Vinette V, Schnappinger D, Ehrt S, Flynn JL (2025) A BCG kill switch strain protects against Mycobacterium tuberculosis in mice and non-human primates with improved safety and immunogenicity Nat Microbiol 10:468-481.

      Wang J, Ge P, Qiang L, Tian F, Zhao D, Chai Q, Zhu M, Zhou R, Meng G, Iwakura Y, Gao GF, Liu CH (2017) The mycobacterial phosphatase PtpA regulates the expression of host genes and promotes cell proliferation Nat Commun 8:244.

      Wang J, Li BX, Ge PP, Li J, Wang Q, Gao GF, Qiu XB, Liu CH (2015) Mycobacterium tuberculosis suppresses innate immunity by coopting the host ubiquitin system Nat Immunol 16:237-245.

      Weng Y, Shepherd D, Liu Y, Krishnan N, Robertson BD, Platt N, Larrouy-Maumus G, Platt FM (2022) Inhibition of the Niemann-Pick C1 protein is a conserved feature of multiple strains of pathogenic mycobacteria Nat Commun 13:5320.

      Xu X, Lu X, Dong X, Luo Y, Wang Q, Liu X, Fu J, Zhang Y, Zhu B, Ma X (2017) Effects of hMASP2 on the formation of BCG infection-induced granuloma in the lungs of BALB/c mice Sci Rep 7:2300.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2025-03064

      Corresponding author(s): Massimo, Hilliard; Sean, Coakley

      1. General Statements

      We are grateful to the reviewers for taking time to review our manuscript and for providing such clear, insightful and actionable suggestions. The consensus between 4 independent reviewers that this story is of general interest to cell biologists, neurobiologists and clinical researchers is remarkable. In addition to our mechanistic insights into the regulation of GTPase activity, we think that the experimental systems we have developed will be of great value to study how GTPases and their associated GAPs and GEFs function to maintain the nervous system, especially due to the demonstrated conservation of these molecules. We believe that our data provide a powerful and tractable model to study such molecules in a physiological context.

      We agree with the reviewers' concerns and propose the plan below to address them.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):


      __Summary Stability of the PLM axon in C. elegans is maintained through interactions with the epidermis. Previous studies by this group found that loss of the tbc-10 Rab GTPase Activating Protein strongly enhanced the PLM axon break phenotype of unc-70/beta-spectrin mutants. TBC-10 is a GAP for RAB-35 and thus loss of rab-35 suppresses the tbc-10 phenotype. Of the two RAB-35 GEFs, loss of RME-4 partially suppressed the tbc-10 phenotype and FLCN-1 was not involved suggesting that there may be an additional GEF involved. Here Bonacossa-Pereira et al identify a point mutation in agef-1a (vd92) as a suppressor of tbc-10 PLM axon break phenotype (all experiments also have a dominant allele of unc-70) and confirm that point mutation is causative by replicating the mutation via genome editing (vd123). Rescue experiments demonstrate that AGEF-1a is required in the epidermis and not PLM as previously demonstrated with tbc-10 and unc-70. Rescue is dependent on a functional SEC7/GEF activity. AGEF-1a is a functional ortholog to human BIG2/ArfGEF2 as its expression fully rescues tbc-10. AGEF-1a functions upstream of RAB-35 as expression of activated RAB-35 can suppress loss of agef-1. AGEF-1a functions in parallel to RME-4 as the double has stronger suppression of tbc-10. AGEF-1a is an ARF GEF, however it functions independently of ARF-1.2 as loss of arf-1.2 does not suppress tbc-10. They demonstrate that AGEF-1a interacts with RAB-35 through colocalization experiments suggesting that AGEF-1a could directly activate RAB-35. Finally, they demonstrate that AGEF-1a regulates the localization of the LET-805 epidermal attached complex component as it restores localization in a tbc-10 mutant.

      Major comments

      The manuscript is well written and easy to understand.

      The experiments are well done and controlled.

      I enjoyed reading this paper. However...

      Some of the claims are not supported by the data.__

      __1) The claim that AGEF-1a directly interacts with RAB-35 was not demonstrated. The evidence provided to support a direct interaction are colocalization experiments in Figure 3. AGEF-1a does partially colocalize with RAB-35 in the epidermis. However, colocalization does not indicate a physical interaction direct or indirect. A simple fix would be to change the claim to that they partially colocalize. Optional, a physical interaction could be done with the split-GFP since they already have the AGEF-1 strain or they could perform co-IP experiments, though neither of those are proof of direct interactions.

      __

      We agree that a biochemical co-IP experiment could provide some answers; however, using full-length AGEF-1a would not only represent a significant technical challenge but would also not prove a direct interaction in a physiological context. To overcome this limitation, and to directly test their interaction in vivo, we propose to use a split-GFP approach as suggested by the reviewer. In this experiment, we will generate an endogenously tagged GFP1-10::rab-35 allele and combine it with the previously generated and available tagged agef-1a::GFP11x7. If AGEF-1 and RAB-35 closely interact, we should observe the reconstitution of full-length GFP. It is possible that the endogenously tagged versions will provide only a very weak GFP signal that is difficult to detect. As an alternative approach, we will generate the same tagged molecules as overexpressed transgenes under epidermal-specific promoters (such as Pdpy-7). If the results are still negative, we agree to temper our claim that these molecules physically interact and rephrase the manuscript to reflect the new data.

      2) The claim that AGEF-1a facilitates RAB-35 activation is not supported. While it is likely that AGEF-1a facilitates RAB-35 activation based on the epistasis experiments as well as studies in mammalian cells there were no experiments to demonstrate that modulating AGEF-1a activity resulted in a change in RAB-35 activity. I would suggest tempering this claim to something along the line that the data are consistent with AGEF-1a regulating RAB-35 activity as shown in mammalian cells. An optional experiment would be to look at the colocalization of RAB-35 with a known effector in wild type and agef-1(vd92) with the expectation that there would be a higher level of colocalization in agef-1 mutants. Effector pull-down experiments or perhaps a cell based GEF assay could be used (PMID: 35196081).


      We welcome this suggestion and acknowledge the limitations of these experiments. While we might be able to determine if AGEF-1 and RAB-35 physically interact in vivo with the experiments proposed above, screening for the relevant rab-35 effector in this context and/or doing effector pull-down/cell based GEF assays would be a significant technical challenge. We propose to temper our claim as suggested.

      3) The claim that AGEF-1a functions independently of ARF-1.2 is not well supported. The fact that the ARF-1.2 mutant does not suppress tbc-10 suggests that ARF-1.2 may not be involved but does not eliminate the possibility that ARF-1.2 functions redundantly with ARF-5 or WARF-1/ARF-1.1. This can be resolved by toning down the claim. Alternatively, this can be tested by RNAi of arf-5 and warf-1 in tbc-10 and arf-1.2; tbc-10 mutants.

      We agree that warf-1 and arf-5 could be functioning redundantly with arf-1.2. We have attempted to generate an AID::arf-5 allele to test the effect of cell-specific degradation, but homozygous AID::arf-5 animals were lethal. We have not yet examined warf-1. We believe the best way to test these two molecules is through RNAi knockdown, and we propose to do this experiment and adjust our interpretation and discussion according to the new data.

      Minor comments

      Figure 1C the CRISPR generated allele (vd123) is referred to as [S784L] and then in 1E vd92 is referred to as [S784L]. Perhaps it would be clearer if the allele name was used instead of the amino acid change.

      We will revise the manuscript to use the allele names instead of the amino acid change.

      Page 6 "We reasoned that if the S784L mutation we isolated causes a similar loss of the GTPase activation function, then SKIN::AGEF-1a[E608K] would not have the capacity to restore the rate of PLM axon breaks to background levels in agef-1[S784L]; tbc-10; vdSi2 animals." It was unclear to me whether you were testing if the S784L mutation could be disrupting a GEF independent function or might disrupt the nucleotide exchange activity as might be tested in a biochemical assay. There are many reasons this change could cause a loss of function phenotype (ie. Improper folding, mislocalization, etc.). The most clear explanation would be that you were testing if GEF function was required for rescue rather than testing if the S784L mutation disrupted GEF activity.

      Indeed, this experiment reveals that reducing the activation of the AGEF-1 target phenocopies the effect of S784L and does not further enhance the effect of S784L. However, it does not answer if, specifically, the GEF function is affected by S784L. We propose to rewrite the quoted sentence as follows: "We asked whether the GEF function is required for axonal damage. If that is the case, then SKIN::AGEF-1a[E608K] overexpression should phenocopy the effect of AGEF-1a[S784L]."

      Page 13. It was unclear how testing if AGEF-1, RME-4, ARF-5 and RAB-35 form complexes in vivo (I assume you are suggesting colocalize based on figure 3 interpretation) would resolve how AGEF-1 was regulating RAB-35.


      We apologize that our phrasing was not clear. We will rewrite this section to better reflect the following idea. Given literature data showing an allosteric interaction between RME-4/DENND1 and ARF-5/Arf5, and our own data showing that AGEF-1 regulates RAB-35, we believe these molecules could form a complex. Considering that we do not have data to support this notion, mostly due to the inability to test the effect of ARF-5, we will present this possibility in the discussion section.


      __**Cross-commenting**

      I agree with the comments made by the other reviewers and I stand by my own as well. I will echo that it is important to know the nature of their agef-1 allele.

      Reviewer #1 (Significance (Required)):

      Bonacossa-Pereira et al identify AGEF-1 as a regulator of axon integrity that functions in a pathway with RAB-35 in the epidermis is an exciting finding. As pointed out in the discussion, mutations in the human ortholog cause neurodevelopmental defects which leads to obvious characterization of BIG2/ArfGEF2 in neurons while this study indicates that this protein can have cell non-autonomous roles in regulating neurons. These findings could have important implications for understanding the etiology of these defects that would be of interest to neurobiologists and clinical researchers.

      The finding of this paper would also be of interest to cell biologists and particularly those studying the roles of Rab and Arf GTPases in membrane trafficking, such as myself. The idea that AGEF-1 might function as a Rab35 GEF is provocative and would generate a lot of interest and skepticism from the field. However, there is no data to support that AGEF-1 would be a direct regulator of Rab35 over the previously demonstrated cross regulation of Rab35 by Arf GTPases. Therefore, it would be fine to speculate in the discussion a direct interaction, but I would refrain from suggesting this as a model and elsewhere in the manuscript.

      __

      Although we agree that current evidence is not sufficient to support a model in which AGEF-1 is a direct regulator of RAB-35, our data point to an important genetic relationship between these molecules in a physiological context in a living animal, with a defined phenotype relevant to the maintenance of the nervous system. We think that the proposed revision experiments will provide a better understanding of how AGEF-1 functions with RAB-35, and we agree with the suggestion to rephrase our manuscript to reflect the limitations of our results.


      __Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This interesting manuscript reports the outcome of a fruitful C. elegans genetic screen with a complex but clever design. Through it, the authors identify AGEF-1 as a GEF that likely regulates the active state of the GTPase RAB-35 in the skin to protect touch receptor axons from mechanical breakage.

      Major points: 1. Based on localization experiments, the authors claim "AGEF-1a interacts with RAB-35 in the epidermis" (Results heading) and state "these data demonstrate that AGEF-1a interacts with a subset of RAB-35 molecules in the epidermis." In general, localization studies cannot be used to conclude physical interaction (with some exceptions such as single-molecule kinetics). In this case, the data in my view do not even make a compelling argument for co-localization. There is a lot of AGEF-1 and RAB-35 signal everywhere and it may not be meaningful that the signals sometimes overlap. A more quantitative approach with controls would be needed to conclude meaningful co-localization. Importantly, this would still not demonstrate interaction.__

      We thank the reviewer for the comment. Indeed, co-localization does not prove a physical interaction, and we appreciate the concern that our imaging data do not make a compelling argument. To address this point, we plan to perform an experiment using a more robust, quantitative and physiologically relevant strategy. We will generate an endogenously tagged mScarlet3::rab-35 allele for precise endogenous localization. In addition, as a positive control, we will generate an endogenous rme-4::GFP11x7 allele to demonstrate, cell-specifically, the level of colocalization of RME-4 with mScarlet3::RAB-35 within the epidermis. To address the possible interaction between AGEF-1a and RAB-35, we will use a split-GFP approach to assess their interaction in vivo, in the context relevant to the phenotypes we observed (see reply to reviewer #1 point 1).

      __2. The effect of the AGEF-1(S784L) mutation is not clear to me. Naively, as the S784L mutation lies in the auto-inhibitory domain, I would have expected AGEF-1 to become constitutively active, not inactive as the authors seem to suggest. Is the idea that it is constitutively auto-inhibited? The main evidence for a loss of function effect seems to be that a putative dominant negative mutation AGEF-1(E608K) does not further suppress axon breakage when co-expressed in trans to AGEF(S784L), but in my view this only shows that, once the defect is suppressed, it cannot be suppressed any further. Defining the nature of the S784L allele is important. Some suggestions, although the authors may come up with different approaches: use of an inducible or cell-specific depletion system like AID/TIR1, Cre/lox, or FLP/FRT to circumvent the lethality of agef-1(0) and reveal what a true loss-of-function looks like; testing if deletion of the auto-inhibitory domain phenocopies S784L to test if this mutation impairs autoinhibition.

      __

      This is a very insightful comment. To address this point, we will follow the reviewer's suggestion and deplete AGEF-1 cell-specifically in the epidermis using the auxin-inducible degron system. Specifically, we will generate an agef-1::AID allele to degrade this molecule in a spatially and temporally controlled fashion, which will allow us to circumvent the lethality of agef-1(0) and determine whether the S784L allele mimics the depletion of AGEF-1.

      Although it would be interesting to further dissect the effect of this mutation on AGEF-1 activity, we believe that this falls outside the scope of this manuscript. As an alternative, we propose to elaborate in the discussion on the possible roles of the S784L mutation, to clarify our model of its function. Our data support a model in which this mutation reduces AGEF-1 function, leading to a reduction in the activity of its downstream target GTPases. It is possible that this is due to AGEF-1 becoming constitutively autoinhibited, or that this mutation affects the structure of the molecule in a way that reduces its affinity for its downstream effectors.

      Minor points: 1. I am not able to see the "vesicle-like structures with a clear luminal space" or RAB-35 being "notably enriched at the membrane near the epidermal furrow" in Fig. 3. The "3D surface rendering" in Fig. 3e is grossly oversampled and should not be included.

      We will rectify this section and include new super-resolved images using Airyscan confocal microscopy. We hope these will yield a better-quality representation of these concepts.

      __ 2. As the agef-1a isoform is specifically referenced throughout, please describe the different agef-1 isoforms somewhere to save readers from having to look this up.__

      Yes, we will include a description of the isoforms. In C. elegans there are two: AGEF-1a, which has been confirmed by cDNA, and AGEF-1b, which is predicted and partially confirmed by cDNA. The mutation we isolated exclusively affects AGEF-1a.

      3. The authors include an interesting speculation in the Discussion: "Future investigations of BIG2-associated neurological disorders should consider... hyper-activity of BIG2 as a driver of neuropathology." If the authors have the tools to test the effect of hyperactive BIG2 in this system, it could be an exciting addition.


      This is an exciting idea that we would like to keep in the Discussion. The regulation of BIG2 activity is a nascent field of research, and we believe that accurately generating and characterising a hyperactive BIG2 is beyond the scope of this manuscript.

      __ On a personal note, since GEFs act oppositely to GTPase Activating Proteins (GAPs), I had to stop and re-read carefully whenever the authors referred to a GEF "activating" a GTPase. I understand their meaning (i.e., putting the GTPase in its active GTP-bound state, not activating its GTPase function) but I wanted to point out this potential confusion in case there is a way to better define terms in the Introduction or change word choice. I realize this may be a standard jargon in the field.__

      Indeed, this is confusing nomenclature and a difficult concept to convey accurately and succinctly. We propose to include a clearer, more didactic explanation of their function. Put simply, GTPases perform cellular functions when bound to GTP. GAPs terminate GTPase activity by catalysing GTP hydrolysis, generating GDP. GEFs initiate GTPase activity by catalysing the release of GDP and allowing GTP binding.

      __ Please check the correct nomenclature for CRISPR/Cas9.__


      We will rectify where appropriate.

      __6. p.7 "these molecules act in synergy", consider replacing with "redundantly".

      __

      We will rectify where appropriate.

      __Reviewer #2 (Significance (Required)):

      The significance of this story is to show that GEF-GTPase pairing can be highly context-dependent. Previous studies have identified GEFs that pair with RAB-35 and GTPases that pair with AGEF-1, but the authors find that these factors have at best a modest role in the context of skin-axon interactions. Instead, the authors suggest a novel GTPase-GEF pairing of RAB-35 with AGEF-1 and provide evidence that this relationship is conserved in the human homolog of AGEF-1. These results suggest that GTPase-GEF pairings depend not only on chemical affinity but also cellular context.

      The main strength of the study is its clever genetics. For the screen, the authors looked for suppressors of a synthetic defect in axon integrity caused in part by elevated activity of RAB-35 due to loss of its GAP TBC-10. It is satisfying that this screen isolated a mutation in a GEF that in principle could counterbalance the loss of a GAP.

      The main weakness of the study is the lack of direct evidence for an AGEF-1/RAB-35 interaction. While not necessary for publication, the inclusion of biochemical data to support the role of AGEF-1 as a GEF for RAB-35 and the effect of the S784L mutation on this activity would strongly elevate the study. The genetic data for this interaction are consistent with the model but not conclusive, and in my view the colocalization data are not compelling. Nevertheless this is a solid genetic story with a clever screen.__

      We appreciate the feedback and are grateful for the positive comments on the significance of our study. As explained in our reply to the significance section of Reviewer #1, if we find evidence of a direct interaction between AGEF-1 and RAB-35 in the proposed new experiments, we will include it in the manuscript; otherwise, we will present it as a possibility in the discussion section, as suggested. We agree that a more nuanced understanding of the effect of the S784L mutation would be valuable and that our colocalization data can be improved, and we have proposed experiments to address these concerns.

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper investigates the mechanism by which molecular pathways in the skin protect the processes of nerves that innervate them from damage. The authors previously showed that spectrin and the small GTPase RAB-35 act in the epidermis of C. elegans to protect mechanosensory axons from breaking. In this paper they used a suppression screen to identify another gene involved in this process, an ARF-GEF called AGEF-1. Partial loss-of-function mutations in agef-1 suppress the axon-breakage phenotype of spectrin mutations, and genetic experiments by the authors are consistent with the possibility that AGEF-1 could act directly as an exchange factor for RAB-35. Consistent with this model, they show that AGEF-1 and RAB-35 colocalise in the skin.

      Major comments: The experiments in this paper are well-designed and well-controlled, and the interpretations of the results are all reasonable. On the other hand, I don't think the authors' hypothesis that AGEF-1 acts directly as an exchange factor for RAB-35, or that these two proteins directly interact, is definitively proven. This is not an issue of the authors overinterpreting their data--the paper is very carefully and thoughtfully written. However, the most interesting and counterintuitive finding--that an ARF-GEF could also be a RAB-GEF--might be strengthened with more experiments (for example, could they more directly show protein-protein interaction through co-IP or mass spec?).__

      We thank the reviewer for the suggestion. We propose to further investigate the notion that AGEF-1a might be a direct interactor of RAB-35 using a split-GFP approach to assess whether these molecules closely interact, in vivo, in the physiological context that is relevant for the maintenance of the touch sensing neurons (please see reply to reviewer #1 major point 1 and reviewer #2 major point 1 for more details).

      Minor comments: There are also two places where the fact that null mutations are lethal (for agef-1 and arf-5) prevented the authors from addressing the effect of agef-1 loss of function in the skin, and addressing whether ARF-5 could be an AGEF-1 target, respectively. In principle, they could have tried to make a CRISPR line in which these genes could be cell-specifically deleted in the skin (using a dpy-7-driven recombinase). I don't think either of these experiments are essential, but if it is feasible to make these lines it would tie up a couple of loose ends.

      We agree to explore the roles of agef-1 and arf-5 loss-of-function. We propose to tissue-specifically degrade agef-1 using an auxin-inducible degradation strategy (please see reviewer #2 major point 2 reply for more details). For arf-5, we propose knocking-down its function using RNAi to overcome lethality (please see reviewer #1 major point 3 reply for more details).

      __Reviewer #3 (Significance (Required)):

      Overall I think this is an interesting paper on a topic of general interest. The most interesting finding is that an exchange factor for an ARF (a small GTPase involved in vesicle coating/uncoating) could also be an exchange factor for a RAB (a small GTPase involved in vesicle tethering). The evidence presented is suggestive and intriguing, though as noted above not completely definitive. In summary, I think it is an interesting paper in its current form, and anything it could do to more firmly establish a direct interaction between AGEF-1 and RAB-35 would increase its impact and importance.

      __

      We thank the reviewer for the positive evaluation of the significance of our study.

      __ Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study Bonacossa-Pereira et al. identify AGEF-1a, an Arf-GEF, as a factor that functions in the epidermis through RAB-35 to regulate axonal integrity of the PLM mechanosensory neurons in C. elegans. Specifically, epidermal attachment sites are regulated by these genes from the epidermis and compromising these attachment sites results in axonal degeneration. The study provides some evidence that RAB-35 and AGEF-1 at least partially colocalize in the skin. Finally, the authors provide evidence that the human orthologue BIG2 is capable of functionally replacing AGEF-1a in C. elegans. Overall, the experiments are well designed and the paper is clear and succinct. The conclusions are supported by the findings and provide an important extension of the authors' findings a few years back, when they identified the role of rab-35 in mediating the epidermal-neuronal attachment sites.

      Major comments: 1. AGEF-1/BIG2 are known to regulate other GTPases such as ARF-5 or ARF-2. The authors exclude a non-redundant function for ARF-2, but are unable to establish a role for ARF-5 because of the lethality associated with the mutation. Alternative approaches, such as cell-specific knock-out or knock-down experiments, could be used. In addition, studies to test potentially physical interaction such as pull-down assays, co-IP experiments and FRET could be used to test whether AGEF-can bind RAB-35 or ARF-5.__

      We thank the reviewer for this suggestion. We propose addressing these concerns using tissue-specific degradation of AGEF-1a (please see reviewer #2 major point 2 for details). To establish a role for ARF-5, we propose to perform an RNAi-mediated knock-down to overcome lethality (please see reviewer #1 major point 3 for details). Finally, we plan to use a split-GFP approach to test the physical interaction between AGEF-1a and RAB-35 in vivo (please see reviewer #1 major point 1 for details).

      __ Phenotypic readout has been limited to only axon breaks. It may be interesting to also test other aspects such as axonal deformities including swellings and vesiculation in other parts of the nervous system. Moreover, behavioral or functional experiments such as response to gentle touch or synaptic integrity could be informative.__

      We have not observed any obvious axonal phenotypes in the touch receptor neurons other than axonal breaks in these mutants, and we will include a statement to this effect. Regarding behavior, we have not tested it because the results would be difficult to interpret for two reasons: first, the breaks are not always bilateral, and one neuron is sufficient to provide a mechanical response; second, the mixed identity of the PLM neurite allows it to retain some function despite being severed. However, if deemed essential, we will perform these experiments.

      __ Overexpression constructs such as SKIN::RAB-35[Q69L], SKIN::BIG2, SKIN::AGEF-1a[E608K] in extrachromosomal transgenes could lead to non-physiological localization or effects. Single copy expression using MosSCI or CRISPR insertions are generally considered better approaches (other than endogenous reporters) to provide accurate insights at the physiological level. While the authors tacitly acknowledge this by conducting the experiments in a rab-35 mutant background and very low transgene concentration, at the very least this caveat regarding the localization should be discussed.__

      This is an important remark, and we appreciate the comment. We acknowledge that experiments using extrachromosomal arrays have inherent caveats, especially for localization studies. To address the RAB-35 localization concern, we plan to repeat the localization studies using RAB-35 endogenously tagged by CRISPR, to overcome possible artifacts caused by expression from extrachromosomal arrays (please see reviewer #1 point 1 for more details). For the cell-specific rescues and the expression of dominant-negative constructs, we believe that using extrachromosomal arrays is sufficient, since this allows us to compare genetically identical transgenic and non-transgenic siblings from independent lines. Moreover, given that these constructs are already driven by a tissue-specific promoter that is inherently stronger than their respective endogenous promoters, even a single-copy insertion would have the same caveats.

      __4. The study does not address clearly whether AGEF-1a acts in parallel to spectrin or upstream/ downstream to it. Epistasis experiments could help to figure out the signaling pathway involved.

      __

      Indeed, this is a concept that we need to communicate more clearly. We have data showing that a mutation in agef-1 does not cause axonal damage on its own, and that it has no effect on the axonal damage caused by the unc-70 dominant-negative mutation alone. We only detect an effect of agef-1 when tbc-10 is mutated together with unc-70 (Fig. 1a of manuscript). Together, these data indicate that agef-1 functions upstream of rab-35, thus acting in parallel to unc-70 (see schematic below) to ensure the mechanical stability of neuron-epidermal attachment. We plan to include these data and the following schematic as a supplement to better convey the idea and discuss the results appropriately.

      __ The finding that BIG2 rescues the mutant defect is an important finding and rightfully finds its place in the abstract. I wonder whether a reference to the human diseases caused by loss of BIG2 in the abstract and introduction would not increase interest/impact for the study, rather than burying this potentially interesting connection in the discussion.

      __

      We appreciate the reviewer's comment, and welcome the suggestion. We propose to include relevant background about BIG2-related human diseases in the abstract and introduction as suggested and expand the discussion regarding BIG2 mutations.

      __Minor comments:

      1. Some explanation about how mutating the autoinhibitory domain could impact the catalytic activity of a GEF might be helpful.__

      We acknowledge that this notion was not well communicated. We propose to elaborate more about why we think a mutation in the autoinhibitory domain might be affecting the GEF activity and we plan to do further experiments to dissect how this might be happening. Please see reviewer #2 major point 2 for a more detailed explanation.

      __ The paper refers to rme-4(b1001) as a null allele while WormBase refers to the same as a missense allele. It would be more accurate to refer to rme-4(b1001) as a strong loss of function or putative null.__

      We agree and will refer to b1001 as a strong loss-of-function.

      __ The paper does not clearly discuss limitations of the hypomorphic agef-1[S784L] and that the observed phenotypes in this hypomorph might underestimate the complete role of AGEF-1a.__

      We thank the reviewer for this suggestion. We propose to elaborate more on these limitations, especially considering the possible new results from the experiments suggested in reply to reviewer #2 major comment point 2.

      __ In figure 1, was there really only one extrachromosomal transgenic line for some of the constructs tested? __

      For the Pdpy-7::AGEF-1a lines we have scored 3 transgenic lines (data not included) and only one yielded a full rescue. For all extrachromosomal lines presented, we tested 3 independent transgenic lines. For brevity, we only included the result for the positive rescues (1 for BIG2 and 1 for AGEF-1a), except for the Pmec-4 lines, of which none rescued the phenotype (data included in Table S2). We will update Table S2 to include all the lines tested.

      __ The concentrations of transgenes vary in different transgenes. Is there a rationale behind this? __

      Yes, we tested multiple injection concentrations for each transgene, and the results varied between constructs; we therefore included only the concentrations at which we observed an effect. As mentioned in point 4 above, we will update Table S2 to include details of all lines tested.

      __ In Fig. 1e: It may be useful to also show the "WT" phenotype, i.e. the strong defects to get a visual comparison for the degree of rescue. __

      We think this suggestion will help the readers. We will include this as a representative dashed line showing the WT phenotype.

      __Reviewer #4 (Significance (Required)):

      The study has identified AGEF-1a as a regulator of axonal maintenance, functioning to protect neurons against mechanical stress by acting through RAB-35. Additionally, this epidermal GEF, AGEF-1a is functionally conserved as its human orthologue BIG2 can replace AGEF-1a in C. elegans for axonal protection. Important points here are that the findings extend prior work by the authors of non-autonomous mechanism that regulates epidermal-neuronal attachment. In my humble opinion, the human disease connection, in particular with regard to the unexplained neuronal phenotypes in patients could be better developed in the manuscript. It may also increase impact/interest of a wonderful story that right now reads a bit 'wormy'.__


      This is an important remark and we are grateful for the positive comments. The fact that the function of human BIG2 is conserved in C. elegans points to a fundamental role of this molecule in multicellular life, and it provides a tractable model to investigate the function of this molecule in a physiological context. We welcome the suggestion to elaborate further on the connection with the unexplained neuronal phenotypes in patients and to use more accessible language to convey our findings to a wider audience.


      3. Description of the revisions that have already been incorporated in the transferred manuscript

      N/A

      4. Description of analyses that authors prefer not to carry out

      __Reviewer #1 __


      "...studies to test potentially physical interaction such as pull-down assays, co-IP experiments and FRET could be used to test whether AGEF-can bind RAB-35 or ARF-5."


      While pull-down assays, co-IP and FRET would reveal whether AGEF-1a can form a complex with RAB-35, we believe that using a full length AGEF-1a would not only represent a significant technical challenge but will also not prove a direct interaction in a physiological context.


      "...An optional experiment would be to look at the colocalization of RAB-35 with a known effector in wild type and agef-1(vd92) with the expectation that there would be a higher level of colocalization in agef-1 mutants. Effector pull-down experiments or perhaps a cell based GEF assay could be used (PMID: 35196081)."


      We think that screening for the relevant rab-35 effector in this context and/or doing effector pull-down/cell based GEF assays would be a significant technical challenge. We propose to address this concern by tempering our claim as suggested by the reviewer.


      "...It may be interesting to also test other aspects such as axonal deformities including swellings and vesiculation in other parts of the nervous system. Moreover, behavioral or functional experiments such as response to gentle touch or synaptic integrity could be informative."

      As indicated above in major point 2 of reviewer 4, these are interesting ideas that might answer how the function of these neurons might be affected. However, in addition to the challenges indicated above, they will not provide further insights into how their integrity is maintained. We believe these will fall outside the scope of the manuscript, but if deemed essential we will perform behavioral analysis.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary:

      In this study Bonacossa-Pereira et al. identify AGEF-1a, an Arf-GEF, as a factor that functions in the epidermis through RAB-35 to regulate axonal integrity of the PLM mechanosensory neurons in C. elegans. Specifically, epidermal attachment sites are regulated by these genes from the epidermis, and compromising these attachment sites results in axonal degeneration. The study provides some evidence that RAB-35 and AGEF-1 at least partially colocalize in the skin. Finally, the authors provide evidence that the human orthologue BIG2 is capable of functionally replacing AGEF-1a in C. elegans. Overall, the experiments are well designed and the paper is clear and succinct. The conclusions are supported by the findings and provide an important extension of the authors' findings from a few years back, when they identified the role of rab-35 in mediating the epidermal-neuronal attachment sites.

      Major comments:

      1. AGEF-1/BIG2 are known to regulate other GTPases such as ARF-5 or ARF-2. The authors exclude a non-redundant function for ARF-2, but are unable to establish a role for ARF-5 because of the lethality associated with the mutation. Alternative approaches, such as cell-specific knockout or knockdown experiments, could address this. In addition, studies to test potentially physical interaction such as pull-down assays, co-IP experiments and FRET could be used to test whether AGEF-can bind RAB-35 or ARF-5.
      2. Phenotypic readout has been limited to only axon breaks. It may be interesting to also test other aspects such as axonal deformities including swellings and vesiculation in other parts of the nervous system. Moreover, behavioral or functional experiments such as response to gentle touch or synaptic integrity could be informative.
      3. Overexpression constructs such as SKIN::RAB-35[Q69L], SKIN::BIG2, SKIN::AGEF-1a[E608K] in extrachromosomal transgenes could lead to non-physiological localization or effects. Single copy expression using MosSCI or CRISPR insertions are generally considered better approaches (other than endogenous reporters) to provide accurate insights at the physiological level. While the authors tacitly acknowledge this by conducting the experiments in a rab-35 mutant background and very low transgene concentration, at the very least this caveat regarding the localization should be discussed.
      4. The study does not address clearly whether AGEF-1a acts in parallel to spectrin or upstream/ downstream to it. Epistasis experiments could help to figure out the signaling pathway involved.
      5. The finding that BIG2 rescues the mutant defect is an important finding and rightfully finds its place in the abstract. I wonder whether a reference to the human diseases caused by loss of BIG2 in the abstract and introduction would not increase interest/impact for the study, rather than burying this potentially interesting connection in the discussion.

      Minor comments:

      1. Some explanation about how mutating the autoinhibitory domain could impact the catalytic activity of a GEF might be helpful.
      2. The paper refers to rme-4(b1001) as a null allele while WormBase refers to the same as a missense allele. It would be more accurate to refer to rme-4(b1001) as a strong loss of function or putative null.
      3. The paper does not clearly discuss limitations of the hypomorphic agef-1[S784L] and that the observed phenotypes in this hypomorph might underestimate the complete role of AGEF-1a.
      4. In figure 1, were there really only one extrachromosomal transgenic line for some of the constructs tested?
      5. The concentrations of transgenes vary in different transgenes. Is there a rationale behind this?
      6. In Fig.1e: It may be useful to also show the "WT" phenotype, i.e. the strong defects to get a visual comparison for the degree of rescue.

      Significance

      The study has identified AGEF-1a as a regulator of axonal maintenance, functioning to protect neurons against mechanical stress by acting through RAB-35. Additionally, this epidermal GEF, AGEF-1a is functionally conserved as its human orthologue BIG2 can replace AGEF-1a in C. elegans for axonal protection. Important points here are that the findings extend prior work by the authors of non-autonomous mechanism that regulates epidermal-neuronal attachment. In my humble opinion, the human disease connection, in particular with regard to the unexplained neuronal phenotypes in patients could be better developed in the manuscript. It may also increase impact/interest of a wonderful story that right now reads a bit 'wormy'.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      This interesting manuscript reports the outcome of a fruitful C. elegans genetic screen with a complex but clever design. Through it, the authors identify AGEF-1 as a GEF that likely regulates the active state of the GTPase RAB-35 in the skin to protect touch receptor axons from mechanical breakage.

      Major points:

      1. Based on localization experiments, the authors claim "AGEF-1a interacts with RAB-35 in the epidermis" (Results heading) and state "these data demonstrate that AGEF-1a interacts with a subset of RAB-35 molecules in the epidermis." In general, localization studies cannot be used to conclude physical interaction (with some exceptions such as single-molecule kinetics). In this case, the data in my view do not even make a compelling argument for co-localization. There is a lot of AGEF-1 and RAB-35 signal everywhere and it may not be meaningful that the signals sometimes overlap. A more quantitative approach with controls would be needed to conclude meaningful co-localization. Importantly, this would still not demonstrate interaction.
      2. The effect of the AGEF-1(S784L) mutation is not clear to me. Naively, as the S784L mutation lies in the auto-inhibitory domain, I would have expected AGEF-1 to become constitutively active, not inactive as the authors seem to suggest. Is the idea that it is constitutively auto-inhibited? The main evidence for a loss of function effect seems to be that a putative dominant negative mutation AGEF-1(E608K) does not further suppress axon breakage when co-expressed in trans to AGEF-1(S784L), but in my view this only shows that, once the defect is suppressed, it cannot be suppressed any further. Defining the nature of the S784L allele is important. Some suggestions, although the authors may come up with different approaches: use of an inducible or cell-specific depletion system like AID/TIR1, Cre/lox, or FLP/FRT to circumvent the lethality of agef-1(0) and reveal what a true loss-of-function looks like; testing if deletion of the auto-inhibitory domain phenocopies S784L to test if this mutation impairs autoinhibition.

      Minor points:

      1. I am not able to see the "vesicle-like structures with a clear luminal space" or RAB-35 being "notably enriched at the membrane near the epidermal furrow" in Fig. 3. The "3D surface rendering" in Fig. 3e is grossly oversampled and should not be included.
      2. As the agef-1a isoform is specifically referenced throughout, please describe the different agef-1 isoforms somewhere to save readers from having to look this up.
      3. The authors include an interesting speculation in the Discussion: "Future investigations of BIG2-associated neurological disorders should consider... hyper-activity of BIG2 as a driver of neuropathology." If the authors have the tools to test the effect of hyperactive BIG2 in this system, it could be an exciting addition.
      4. On a personal note, since GEFs act oppositely to GTPase Activating Proteins (GAPs), I had to stop and re-read carefully whenever the authors referred to a GEF "activating" a GTPase. I understand their meaning (i.e., putting the GTPase in its active GTP-bound state, not activating its GTPase function) but I wanted to point out this potential confusion in case there is a way to better define terms in the Introduction or change word choice. I realize this may be a standard jargon in the field.
      5. Please check the correct nomenclature for CRISPR/Cas9.
      6. p.7 "these molecules act in synergy", consider replacing with "redundantly".

      Significance

      The significance of this story is to show that GEF-GTPase pairing can be highly context-dependent. Previous studies have identified GEFs that pair with RAB-35 and GTPases that pair with AGEF-1, but the authors find that these factors have at best a modest role in the context of skin-axon interactions. Instead, the authors suggest a novel GTPase-GEF pairing of RAB-35 with AGEF-1 and provide evidence that this relationship is conserved in the human homolog of AGEF-1. These results suggest that GTPase-GEF pairings depend not only on chemical affinity but also on cellular context.

      The main strength of the study is its clever genetics. For the screen, the authors looked for suppressors of a synthetic defect in axon integrity caused in part by elevated activity of RAB-35 due to loss of its GAP TBC-10. It is satisfying that this screen isolated a mutation in a GEF that in principle could counterbalance the loss of a GAP.

      The main weakness of the study is the lack of direct evidence for an AGEF-1/RAB-35 interaction. While not necessary for publication, the inclusion of biochemical data to support the role of AGEF-1 as a GEF for RAB-35 and the effect of the S784L mutation on this activity would strongly elevate the study. The genetic data for this interaction are consistent with the model but not conclusive, and in my view the colocalization data are not compelling. Nevertheless this is a solid genetic story with a clever screen.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Stability of the PLM axon in C. elegans is maintained through interactions with the epidermis. Previous studies by this group found that loss of the tbc-10 Rab GTPase Activating Protein strongly enhanced the PLM axon break phenotype of unc-70/beta-spectrin mutants. TBC-10 is a GAP for RAB-35 and thus loss of rab-35 suppresses the tbc-10 phenotype. Of the two RAB-35 GEFs, loss of RME-4 partially suppressed the tbc-10 phenotype and FLCN-1 was not involved, suggesting that there may be an additional GEF involved. Here Bonacossa-Pereira et al. identify a point mutation in agef-1a (vd92) as a suppressor of the tbc-10 PLM axon break phenotype (all experiments also have a dominant allele of unc-70) and confirm that the point mutation is causative by replicating the mutation via genome editing (vd123). Rescue experiments demonstrate that AGEF-1a is required in the epidermis and not in PLM, as previously demonstrated with tbc-10 and unc-70. Rescue is dependent on a functional SEC7/GEF activity. AGEF-1a is a functional ortholog of human BIG2/ArfGEF2, as its expression fully rescues tbc-10. AGEF-1a functions upstream of RAB-35, as expression of activated RAB-35 can suppress loss of agef-1. AGEF-1a functions in parallel to RME-4, as the double has stronger suppression of tbc-10. AGEF-1a is an ARF GEF; however, it functions independently of ARF-1.2, as loss of arf-1.2 does not suppress tbc-10. They demonstrate that AGEF-1a interacts with RAB-35 through colocalization experiments, suggesting that AGEF-1a could directly activate RAB-35. Finally, they demonstrate that AGEF-1a regulates the localization of the LET-805 epidermal attachment complex component, as it restores localization in a tbc-10 mutant.

      Major comments

      The manuscript is well written and easy to understand.

      The experiments are well done and controlled.

      I enjoyed reading this paper. However...

      Some of the claims are not supported by the data.

      1. The claim that AGEF-1a directly interacts with RAB-35 was not demonstrated. The evidence provided to support a direct interaction consists of the colocalization experiments in Figure 3. AGEF-1a does partially colocalize with RAB-35 in the epidermis. However, colocalization does not indicate a physical interaction, direct or indirect. A simple fix would be to change the claim to say that they partially colocalize. Optionally, a physical interaction could be tested with split-GFP, since they already have the AGEF-1 strain, or they could perform co-IP experiments, though neither of those is proof of a direct interaction.
      2. The claim that AGEF-1a facilitates RAB-35 activation is not supported. While it is likely that AGEF-1a facilitates RAB-35 activation based on the epistasis experiments as well as studies in mammalian cells, there were no experiments to demonstrate that modulating AGEF-1a activity resulted in a change in RAB-35 activity. I would suggest tempering this claim to something along the lines that the data are consistent with AGEF-1a regulating RAB-35 activity as shown in mammalian cells. An optional experiment would be to look at the colocalization of RAB-35 with a known effector in wild type and agef-1(vd92) with the expectation that there would be a higher level of colocalization in agef-1 mutants. Effector pull-down experiments or perhaps a cell based GEF assay could be used (PMID: 35196081).
      3. The claim that AGEF-1a functions independently of ARF-1.2 is not well supported. The fact that the ARF-1.2 mutant does not suppress tbc-10 suggests that ARF-1.2 may not be involved but does not eliminate the possibility that ARF-1.2 functions redundantly with ARF-5 or WARF-1/ARF-1.1. This can be resolved by toning down the claim. Alternatively, this can be tested by RNAi of arf-5 and warf-1 in tbc-10 and arf-1.2; tbc-10 mutants.

      Minor comments

      Figure 1C the CRISPR generated allele (vd123) is referred to as [S784L] and then in 1E vd92 is referred to as [S784L]. Perhaps it would be clearer if the allele name was used instead of the amino acid change.

      Page 6 "We reasoned that if the S784L mutation we isolated causes a similar loss of the GTPase activation function, then SKIN::AGEF-1a[E608K] would not have the capacity to restore the rate of PLM axon breaks to background levels in agef-1[S784L]; tbc-10; vdSi2 animals." It was unclear to me whether you were testing if the S784L mutation could be disrupting a GEF independent function or might disrupt the nucleotide exchange activity as might be tested in a biochemical assay. There are many reasons this change could cause a loss of function phenotype (i.e., improper folding, mislocalization, etc.). The clearest explanation would be that you were testing if GEF function was required for rescue rather than testing if the S784L mutation disrupted GEF activity.

      Page 13. It was unclear how testing if AGEF-1, RME-4, ARF-5 and RAB-35 form complexes in vivo (I assume you are suggesting colocalize based on figure 3 interpretation) would resolve how AGEF-1 was regulating RAB-35.

      Cross-commenting

      I agree with the comments made by the other reviewers and I stand by my own as well. I will echo that it is important to know the nature of their agef-1 allele.

      Significance

      The identification by Bonacossa-Pereira et al. of AGEF-1 as a regulator of axon integrity that functions in a pathway with RAB-35 in the epidermis is an exciting finding. As pointed out in the discussion, mutations in the human ortholog cause neurodevelopmental defects, which naturally leads to characterization of BIG2/ArfGEF2 in neurons, while this study indicates that this protein can have cell non-autonomous roles in regulating neurons. These findings could have important implications for understanding the etiology of these defects that would be of interest to neurobiologists and clinical researchers.

      The findings of this paper would also be of interest to cell biologists, particularly those studying the roles of Rab and Arf GTPases in membrane trafficking, such as myself. The idea that AGEF-1 might function as a Rab35 GEF is provocative and would generate a lot of interest and skepticism from the field. However, there are no data to support that AGEF-1 would be a direct regulator of Rab35 over the previously demonstrated cross regulation of Rab35 by Arf GTPases. Therefore, it would be fine to speculate about a direct interaction in the discussion, but I would refrain from presenting it as a model or asserting it elsewhere in the manuscript.

    1. # True probabilities bp = c(2, 0, 2, -2)

      This is confusing. How are these probabilities? Do you mean parameters? The parameters beta in the text were different, though: -2, 0, 2, 4.

  5. www.assemblee-nationale.fr
    1. SUMMARY DOCUMENT: Parenting Support Policies in France

      Source: Information Report No. 1638, Assemblée Nationale, Délégation aux droits des femmes et à l'égalité des chances entre les hommes et les femmes (Delegation for Women's Rights and Equal Opportunities between Men and Women), on parenting support policies, presented by Ms. Sarah Legrain and Ms. Delphine Lingemann, registered on 24 June 2025.

      Executive Summary

      This report by the Delegation for Women's Rights and Equal Opportunities between Men and Women highlights the persistent inequalities in the division of domestic and parental responsibilities in France, which are borne mostly by women.

      It shows that parenthood, far from being gender-neutral, is a major cause of economic, professional, and social inequality between men and women. The "parental penalty" significantly affects women's careers and incomes, while men are largely spared.

      The rapporteurs identify several key areas for fostering a more equal division of parenting tasks and promoting a positive, egalitarian vision of parenthood, and make 44 recommendations to that end.

      These recommendations cover education and information, taking parenthood into account at work, supporting parents from the moment they wish to have a child, overhauling parental leave systems and childcare arrangements, supporting parents of adolescents, and supporting single-parent families.

      Main Themes and Key Ideas

      1. The Unequal Domestic and Parental Load: An Obstacle to Women's Equality

      • Persistent Gendered Division: Despite an impression of equality, women continue to take on most domestic and parental responsibilities. On average, they perform 71% of the household's domestic tasks and 65% of its parenting tasks. This division is deeply rooted in historical legacy and stubborn gender stereotypes.
      • Gender Stereotypes: The idea that "mothers know better than fathers how to respond to children's needs and expectations" is widespread among adults (60% agree) and persists among young people (54% of 18-24 year olds). These stereotypes contribute to the social devaluation of tasks considered feminine.
      • Mothers' "Double Day": The arrival of children worsens this inequality. For women, it represents about "five hours of additional work", whereas for men it "reduces their domestic and parental time by two hours". Employed women combine professional, domestic, and parental work, totaling "eleven hours a day compared with less than ten hours for men".
      • Impact of the Leave System: The difference in length between maternity leave (16 weeks) and paternity leave (28 days) reinforces the dynamic of a "mother as 'primary parent' and a father as 'auxiliary'". Parental leave is also taken overwhelmingly by mothers (94% of cases), which penalizes their careers.
      • Childcare Difficulties: The shortage and uneven distribution of places in crèches and with childminders often force mothers to compensate for the system's failings. "Nearly 20% of parents do not obtain a childcare arrangement, and more than 160,000 do not return to work for lack of a childcare solution for their child", with mothers serving as the "adjustment variable".

      2. Heavy Consequences for Mothers: Human, Economic, and Social Costs

      • Parental Penalty at Work: Parenthood has a "negative impact on women's career paths", whereas it has "no effect, or almost none, on men's career progression". "90% of income inequalities between women and men are directly due to the 'parental penalty' that women suffer". Ten years after the arrival of the first child, women's average income has fallen by about 38%.
      • Discrimination: More than six in ten women consider that being a mother is an obstacle to their career. 27% of women who report being discriminated against at work consider that this discrimination is linked to pregnancy or maternity leave.
      • Vulnerability of Single Mothers: Single mothers (82% of single-parent families) are particularly affected. They suffer a "triple intersecting penalty: their gender, their professional situation […], their family situation", which exposes them to precarious, poorly paid jobs and increases their risk of poverty. "Nearly one in five single mothers is poor even though she has a job".
      • Risk of Exhaustion and Mental Health: The disproportionate load creates a "real risk of exhaustion for mothers". Isolation can foster postpartum depression, which affects about 20% of women and is the "leading cause of maternal mortality in the year following the birth of the child".
      • High Economic Cost: Beyond the loss of income due to maternity leave and reduced working time, separation carries a "heavy cost for mothers". One separated woman in three "falls below the poverty line in the year of the separation", with her standard of living dropping by about 20% (compared with 7% for men). 39% of children living in a single-parent family are in poverty.

      3. Proposals for Egalitarian Parenthood

      The rapporteurs make 44 recommendations to transform parenting support policies, with a focus on equality:

      Education and Information:

      • Introduce "domestic activity classes" in primary or lower secondary school to teach these skills to all children.
      • Launch "national campaigns against gender stereotypes" about parenthood.
      • Adopt "neutral terminology" (e.g. "école pré-élémentaire" (pre-elementary school) instead of "école maternelle", and "prestation pour naissance et soin du mineur" (benefit for the birth and care of a minor) instead of "congé maternité/paternité" (maternity/paternity leave)).
      • Improve the information given to parents about support schemes.

      Taking Parenthood into Account at Work:

      • Include parenthood in "corporate social responsibility (CSR)" and extend the "Charte de la parentalité" (Parenthood Charter) to all companies with more than 50 employees.
      • Amend the Code du travail (Labour Code) to explicitly include parenthood in "company negotiations on professional equality".
      • Include parenthood criteria in the future "professional equality index".
      • Grant "authorized absences" (4 half-days per year) to parents for key moments in their children's schooling.

      Support from the Wish for a Child through the Postpartum Period:

      • Broaden "pre-conception consultations" to cover the parental project, and allow the second parent to attend all mandatory medical appointments during pregnancy.
      • Devote one childbirth preparation session to the "parental project".
      • Strengthen the sick-leave scheme for pregnancy loss and extend it to voluntary terminations, with an authorized absence for the partner.
      • Facilitate access to "breastfeeding consultations" and strengthen the "training of practitioners on postpartum depression".
      • Provide an "optional consultation with a psychologist, 100% reimbursed" for mothers within three months of the birth.
      • Extend "'family caregiver' leave" to the second parent supporting a mother suffering from postpartum depression.
      • Combat mothers' isolation by offering parents the chance "to be put in touch with other parents welcoming their child at the same time".

      Reform of Leave and Childcare Arrangements:

      • Paternity Leave: "Gradually extend paternity leave to sixteen weeks, on a par with maternity leave". Eight weeks would be mandatory (4 at birth, 4 after the mother's maternity leave) and eight would be optional and could be split up. This measure is a "key lever for equality between parents" and responds to fathers' wish to be more involved.
      • Parental Leave: Reform parental leave by "making it more financially attractive without reducing its scope for low-income households", and consider a "gradual return to work" after the leave.
      • Childcare Arrangements: Ensure the "clarity and transparency" of childcare options, "invest to increase and harmonize crèche provision across the country", and "upgrade the status of early childhood professions".

      Support for Parents of Adolescents:

      • Broaden the remit of "lieux d'accueil enfants-parents" (child-parent drop-in centres) so that they can "receive adolescents".
      • Put in place "public policies specifically targeting parents of adolescents".
      • Strengthen provision of "child psychiatry" and "school health services" in response to the deterioration in young people's mental health.
      • Launch a "national information campaign on the mental health of children and adolescents".

      Support for Single-Parent Families:

      • Rethink the "method of calculating child support" to reflect the real cost of raising a child, and allow the receiving parent to "exempt child support payments from tax".
      • Introduce a "deduction from the amount of child support counted in the resource bases for family benefits and housing assistance, up to the amount of the allocation de soutien familial (ASF, family support allowance)".
      • "Decouple payment of the ASF from the recipient's couple status" and "open housing benefits (APL) to both parents" to make it easier to house the children.
      • Give single mothers the "option of transferring leave entitlements to a close relative or friend of their choice" and "double the number of 'sick child' days".
      • Study the creation of a "single-parent family status" with specific rights.

      Rapporteurs' Conclusion

      The rapporteurs state that, despite some progress, mothers remain the "primary parent", with negative consequences for their health and their working lives.

      An "ambitious reform of the leave system", in particular of second-parent leave, is an essential "driver of equality".

      Drawing on the Scandinavian and Spanish models, France can move toward egalitarian parenthood, not only for the emancipation of women but also as a response to demographic concerns.

    1. Step 1: 1/2 bond dissociation energy of Cl2(g) - Known (122 kJ/mol)
       Step 2: Enthalpy of sublimation of Na(s) - Known (107.3 kJ/mol)
       Step 3: Ionization energy of Na(g) - Known (496 kJ/mol)
       Step 4: Electron affinity of Cl(g) - Known (-349 kJ/mol)
       Step 5: Lattice formation energy of NaCl - Unknown
       Step 6: Enthalpy of formation of NaCl - Known (-411 kJ/mol)

      This is specific to the formation of NaCl, but keep the steps in mind.
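
      The six steps form a Born-Haber cycle, so the one unknown (Step 5) follows from Hess's law: the enthalpy of formation equals the sum of the other steps plus the lattice formation energy. The sketch below is only an illustration using the values quoted in the steps; the variable and function names are made up for this note.

```python
# Born-Haber cycle for NaCl, using the values quoted in the steps (kJ/mol).
half_bond_dissociation_cl2 = 122.0   # Step 1: 1/2 Cl2(g) -> Cl(g)
sublimation_na = 107.3               # Step 2: Na(s) -> Na(g)
ionization_na = 496.0                # Step 3: Na(g) -> Na+(g) + e-
electron_affinity_cl = -349.0        # Step 4: Cl(g) + e- -> Cl-(g)
formation_nacl = -411.0              # Step 6: Na(s) + 1/2 Cl2(g) -> NaCl(s)


def lattice_formation_energy(formation, other_steps):
    """Hess's law: formation = sum(other steps) + lattice formation energy."""
    return formation - sum(other_steps)


lattice_nacl = lattice_formation_energy(
    formation_nacl,
    [half_bond_dissociation_cl2, sublimation_na, ionization_na, electron_affinity_cl],
)
print(f"Step 5, lattice formation energy of NaCl: {lattice_nacl:.1f} kJ/mol")
```

      The same function works for any other ionic solid once its own five known values are substituted, which is why the order of the steps is worth remembering.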

    1. Answer to Question 1.5: Pyrogens cause body temperature to rise by acting on the hypothalamus, which controls the body’s thermostat. They trigger:

      Vasoconstriction – blood vessels narrow, reducing heat loss.

      Shivering – muscles generate more heat.

      Reduced sweating – less heat is lost through evaporation.

      Together, these responses raise body temperature, producing a fever.

      Answer to Question 1.6: Homeostasis is the maintenance of a stable internal environment despite changes in external conditions.

      Answer to Question 1.7:

      1. Stimulus: Body temperature rises or falls.
      2. Sensor: Thermoreceptors detect the change.
      3. Control center: Hypothalamus processes the information.
      4. Effectors: Sweat glands activate and blood vessels dilate (cooling) or muscles shiver and vessels constrict (warming).
      5. Response: Body temperature returns to normal.
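
      The five stages above describe a negative feedback loop, which can be sketched as a tiny simulation. This is a toy model, not physiology: the 37.0 °C set point, the gain of 0.5, and all names are assumptions chosen only to show that a correction opposing the deviation pulls the value back toward the set point.

```python
SET_POINT = 37.0  # assumed set point in degrees Celsius (illustrative)


def effector_response(temp, gain=0.5):
    """Sensor + control center + effector in one step.

    The deviation from the set point is measured, and the returned
    correction has the opposite sign (cooling if too warm, warming if too cold).
    """
    error = temp - SET_POINT
    return -gain * error


def simulate(start_temp, steps=20):
    """Apply the corrective response repeatedly and record the temperature."""
    temp = start_temp
    history = [temp]
    for _ in range(steps):
        temp += effector_response(temp)
        history.append(temp)
    return history


too_warm = simulate(39.0)  # stimulus: temperature rises
too_cold = simulate(35.0)  # stimulus: temperature falls
print(round(too_warm[-1], 3), round(too_cold[-1], 3))
```

      In both runs the deviation shrinks at every step, which is the defining feature of negative feedback: the response reverses the stimulus.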

      Answer to Question 1.8:

      1. Stimulus: Blood osmolarity increases (too concentrated) or decreases (too diluted).
      2. Sensor: Osmoreceptors in the hypothalamus detect the change.
      3. Control center: Hypothalamus signals the pituitary gland.
      4. Effectors: Kidneys retain or excrete water; thirst may be triggered.
      5. Response: Blood osmolarity returns to normal.

      Answer to Question 1.9: d

      Answer to Question 1.10: c

      Answer to Question 1.11: b

      Answer to Question 1.12:

      1. Stimulus – a change in the internal environment
      2. Sensor/receptor – detects the change
      3. Control center – processes the information and signals effectors
      4. Effector – acts to reverse the change

      If secretion of a body chemical became too great: The negative feedback loop would attempt to reduce or stop secretion to bring levels back to normal. If the system fails, it could lead to imbalances or disease.

      Answer to Question 1.13:

      1. Vasoconstriction – reduces heat loss from skin
      2. Shivering – generates heat through muscle activity
      3. Behavioral responses – seeking shelter, adding insulation, huddling
      4. Hormonal responses – thyroid hormone increases metabolic heat production
    1. Answer to Question 1.2: d

      Possible Answer to Question 1.3:

      Fairyflies (tiny insects <0.2 mm) rely on diffusion for oxygen because their bodies are too small for complex respiratory organs.

      Why size limits life: Too small → not enough cells for organs; too large → diffusion alone can’t supply oxygen efficiently.

      Possible Answer(s) for Question 1.4:

      1. Oxygen availability - Limited oxygen in water or soil can restrict survival.
      2. Temperature - Extreme heat or cold can denature proteins and disrupt metabolism.
      3. Water availability - Deserts and dry habitats limit hydration and cellular function.
      4. Nutrient availability - Scarcity of essential minerals or food limits growth.
      5. Size - Small or large body size can restrict diffusion, movement, or organ development.

    1. Fixed activities: 1. 15 hours 2. 30 hours 3. 0 hours 4. 0 hours 5. 56 hours 6. 17.5 hours 7. 10.5 hours 8. 7.5 hours 9. 50 minutes 10. 150 minutes Total: Discretionary Activities: 1. 0 hours 2. 60.5 hours 3. 7 hours 4. 7 hours 5. 7 hours 6. 7 hours 7. 150 minutes
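The entries above mix hours and minutes, so a total needs a unit conversion first. A minimal sketch, assuming every entry covers the same time period and that the minute entries should be converted to hours; the list and function names are illustrative.

```python
# Sum the listed activity times, converting minute entries to hours.
fixed_hours = [15, 30, 0, 0, 56, 17.5, 10.5, 7.5]   # fixed items 1-8
fixed_minutes = [50, 150]                            # fixed items 9-10
discretionary_hours = [0, 60.5, 7, 7, 7, 7]          # discretionary items 1-6
discretionary_minutes = [150]                        # discretionary item 7


def total_hours(hours, minutes):
    """Return the combined total in hours."""
    return sum(hours) + sum(minutes) / 60


fixed_total = total_hours(fixed_hours, fixed_minutes)
discretionary_total = total_hours(discretionary_hours, discretionary_minutes)
print(f"Fixed: {fixed_total:.2f} h, discretionary: {discretionary_total:.2f} h")
```

Under these assumptions the fixed activities come to about 139.83 hours and the discretionary activities to 91 hours.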

    1. laws
      • Informativo 1187
      • ADI 4854 / RS
      • Adjudicating body: Full Court (Tribunal Pleno)
      • Rapporteur: Justice NUNES MARQUES
      • Judgment: 22/08/2025 (virtual session)
      • Branch of law: Tax
      • Subject: ICMS; Special Inspection Regime; Political Sanction; Ancillary Tax Obligations; Habitual Debtor

      ICMS: special inspection regime applied to habitual debtors

      Summary - A state law that establishes a Special Inspection Regime (Regime Especial de Fiscalização, REF) applicable to taxpayers deemed <u>habitual debtors</u> (devedores contumazes) of ICMS is constitutional. It does not constitute a political sanction, nor does it violate the constitutional principles of tax legality (CF/1988, art. 150, I), freedom of work and commerce (CF/1988, art. 5, XIII; and 170, sole paragraph), or tax equality (CF/1988, arts. 5, caput; and 150, II).

      • According to this Court's case law (1), subjecting a defaulting taxpayer to a differentiated tax regime does not constitute a political sanction, provided that the measure does not make it impossible to carry on the business activity and observes criteria of proportionality and reasonableness.
      • In this case, the challenged state law sets objective criteria for classifying taxpayers as habitual debtors and establishes the REF. Its content is not reserved to federal complementary law, inasmuch as (i) it does not address essential elements of the tax, such as the taxable event, assessment, or tax credit (CF/1988, art. 146, III, b); and (ii) it does not create coercive collection mechanisms such as those prohibited by STF Súmulas 70, 323, and 547 (2).
      • Therefore, there is no violation of the principle of tax legality. Accordingly, the REF is a legitimate instrument of tax control whose validity derives from tax legislation itself (CTN/1966, art. 96), so it is compatible with the federated entity's power to regulate ancillary obligations (3). The measures provided for, such as changing payment deadlines and intensifying inspection, do not prevent the exercise of economic activity, since they apply only to serious and repeated cases of default.
      • It is therefore an exceptional and proportional mechanism, aimed at inducing compliant conduct and preserving revenue. Moreover, the provision excluding from the REF the original holders of credits arising from unpaid precatórios does not infringe the principle of tax equality. Since original creditors and assignees of precatórios are not in identical situations, it is legitimate to give different treatment to those who have a direct and recognized relationship with the public authorities, especially in settings aimed at simplifying inspection and protecting public assets.
      • On the basis of these and other understandings, the Full Court unanimously heard the action in part and, to that extent, dismissed it, confirming the presumption of constitutionality of arts. 2, §§ 1, 2, and 3, and art. 3, both of Law No. 13.711/2011 of the State of Rio Grande do Sul (4), as amended by Rio Grande do Sul Law No. 14.180/2012.

      (1) Precedents cited: RE 486.175 AgR-EDv, ARE 1.349.448 AgR, and ADI 3.952. (2) Súmulas cited: Súmula 70/STF, Súmula 323/STF, and Súmula 547/STF. (3) CTN/1966: “Art. 113. The tax obligation is principal or ancillary. (...) § 2 The ancillary obligation arises from tax legislation and has as its object the performances, positive or negative, provided for therein in the interest of the collection or inspection of taxes.” (4) Law No. 13.711/2011 of the State of Rio Grande do Sul: “Art. 2. A taxpayer shall be considered a habitual debtor and shall be subject to the Special Inspection Regime, as provided in regulation, when any of its establishments located in the State systematically fails to pay the ICMS due within the deadlines set in the Regulamento do Imposto sobre Operações Relativas à Circulação de Mercadorias e sobre Prestações de Serviços de Transporte Interestadual e Intermunicipal e de Comunicação - RICMS. § 1 For the purposes of this article, a habitual debtor is a taxpayer that: I - fails to pay the ICMS declared in the Guia de Informação e Apuração do ICMS - GIA - in eight tax assessment months within the twelve months preceding the current one (wording given by Law No. 14.180/12); II - has tax credits registered as Dívida Ativa in an amount exceeding 38,500 UPFs-RS, arising from tax not declared in the GIA, in eight tax assessment months within the twelve months preceding the current one; or (wording given by Law No. 14.180/12) III - has tax credits registered as Dívida Ativa in an amount exceeding: (added by Law No. 14.180/12) a) 30% (thirty percent) of its known assets; or (added by Law No. 14.180/12); b) 25% (twenty-five percent) of the annual revenue declared in the GIA or in the Guia Informativa - GI -. (added by Law No. 14.180/12) § 2 For the purposes of the ‘caput’ of art. 2, natural or legal persons who are original holders of credits arising from precatórios unpaid by the State and its autarchies shall not be considered habitual debtors, up to the limit of the respective tax debt recorded as Dívida Ativa. § 3 Debts whose enforceability is suspended under the Código Tributário Nacional shall not be counted for the purposes of this article. (...) Art. 3. A taxpayer shall cease to be considered a habitual debtor if the debts that gave rise to that status are extinguished or have their enforceability suspended.”

      Legislation: CF/1988: arts. 5, caput, XIII; 146, III, b; 150, I and II; and 170, sole paragraph. CTN/1966: arts. 96 and 113, § 2. Law No. 13.711/2011 of the State of Rio Grande do Sul: arts. 2, §§ 1, 2, and 3, and art. 3. Law No. 14.180/2012 of the State of Rio Grande do Sul.

      Precedents: RE 486.175 AgR-EDv, ARE 1.349.448 AgR, ADI 3.952, Súmula 70/STF, Súmula 323/STF, and Súmula 547/STF.

    1. Reviewer #2 (Public review):

      McDougal et al. describe the surprising finding that IFIT1 proteins from different mammalian species inhibit replication of different viruses, indicating that evolution of IFIT1 across mammals has resulted in host species-specific antiviral specificity. Before this work, research into the antiviral activity and specificity of IFIT1 had mostly focused on the human ortholog, which was described to inhibit viruses including vesicular stomatitis virus (VSV) and Venezuelan equine encephalitis virus (VEEV) but not other viruses including Sindbis virus (SINV) and parainfluenza virus type 3 (PIV3). In the current work, the authors first perform evolutionary analyses on IFIT1 genes across a wide range of mammalian species and reveal that IFIT1 genes have evolved under positive selection in primates, bats, carnivores, and ungulates. Based on these data, they hypothesize that IFIT1 proteins from these diverse mammalian groups may show distinct antiviral specificities against a panel of viruses. By generating human cells that express IFIT1 proteins from different mammalian species, the authors show a wide range of antiviral activities of mammalian IFIT1s. Most strikingly, they find several IFIT1 proteins that have completely different antiviral specificities relative to human IFIT1, including IFIT1s that fail to inhibit VSV or VEEV, but strongly inhibit PIV3 or SINV. These results indicate that there is potential for IFIT1 to inhibit a much wider range of viruses than human IFIT1 inhibits. Electrophoretic mobility shift assays (EMSAs) suggest that some of these changes in antiviral specificity can be ascribed to changes in direct binding of viral RNAs. Interestingly, they also find that chimpanzee IFIT1, which is >98% identical to human IFIT1, fails to inhibit any tested virus. Replacing three residues from chimpanzee IFIT1 with those from human IFIT1, one of which has evolved under positive selection in primates, restores activity to chimpanzee IFIT1. Together, these data reveal a vast diversity of IFIT1 antiviral specificity encoded by mammals, consistent with an IFIT1-virus evolutionary "arms race".

      Overall, this is a very interesting and well-written manuscript that combines evolutionary and functional approaches to provide new insight into IFIT1 antiviral activity and species-specific antiviral immunity. The conclusion that IFIT1 genes in several mammalian lineages are evolving under positive selection is supported by the data. The virology results, which convincingly show that IFIT1s from different species have distinct antiviral specificity, are the most surprising and exciting part of the paper. As such, this paper will be interesting for researchers studying mechanisms of innate antiviral immunity, as well as those interested in species-specific antiviral immunity. Moreover, it may prompt others to test a wide range of orthologs of antiviral factors beyond those from humans or mice, which could further the concept of host-specific innate antiviral specificity. Additional areas for improvement, which are mostly to clarify the presentation of data and conclusions, are described below.

      Strengths:

      (1) This paper is a very strong demonstration of the concept that orthologous innate immune proteins can evolve distinct antiviral specificities. Specifically, the authors show that IFIT1 proteins from different mammalian species are able to inhibit replication of distinct groups of viruses, which is most clearly illustrated in Figure 4G. This is an unexpected finding, as the mechanism by which IFIT1 inhibits viral replication was assumed to be similar across orthologs. While the molecular basis for these differences remains unresolved, this is a clear indication that IFIT1 evolution functionally impacts host-specific antiviral immunity and that IFIT1 has the potential to inhibit a much wider range of viruses than previously described.

      (2) By revealing these differences in antiviral specificity across IFIT1 orthologs, the authors highlight the importance of sampling antiviral proteins from different mammalian species to understand what functions are conserved and what functions are lineage- or species-specific. These results might therefore prompt similar investigations with other antiviral proteins, which could reveal a previously undiscovered diversity of specificities for other antiviral immunity proteins.

      (3) The authors also surprisingly reveal that chimpanzee IFIT1 shows no antiviral activity against any tested virus despite only differing from human IFIT1 by eight amino acids. By mapping this loss of function to three residues on one helix of the protein, the authors shed new light on a region of the protein with no previously known function.

      (4) Combined with evolutionary analyses that indicate that IFIT1 genes are evolving under positive selection in several mammalian groups, these functional data indicate that IFIT1 is engaged in an evolutionary "arms race" with viruses, which results in distinct antiviral specificities of IFIT1 proteins from different species.

      Weaknesses:

      (1) Some of the results and discussion text could be more focused on the model of evolution-driven changes in IFIT1 specificity. In particular, the majority of the residue mapping is on the chimpanzee protein, where it would appear that this protein has lost all antiviral function, rather than changing its antiviral specificity like some other examples in this paper. As such, the connection between the functional mapping of individual residues with the positive selection analysis and changes in antiviral specificity is not present. While the model that changes in antiviral specificity have been positively selected for is intriguing, it is not supported by data in the paper.

      (2) The strength of the differences in antiviral specificity could be highlighted to a greater degree. Specifically, the text describes a number of interesting examples of differences in inhibition of viruses from Figure 3C and 3D, and 4C-F. The revised version has added some clarity by at least providing raw data for 3C and 3D for the reader to make their own comparisons, but it is still difficult to quickly assess which are the most interesting comparisons to make (e.g. for future mapping of residues that might be important).

    2. Reviewer #3 (Public review):

      Summary:

      This manuscript by McDougal et al. demonstrates species-specific activities of diverse IFIT1 orthologs and seeks to use evolutionary analysis to identify key amino acids under positive selection that contribute to the antiviral activity of this host factor. While the authors identify amino acid residues important for the antiviral activity of some orthologs, and propose a possible mechanism by which these residues may function, the significance or applicability of these findings to other orthologs is unclear. However, the subject matter is of interest to the field, and these findings contribute to the body of knowledge regarding IFIT1 evolution.

      Strengths:

      Assessment of multiple IFIT1 orthologs shows the wide variety of antiviral activity of IFIT1, and identification of residues outside of the known RNA binding pocket in the protein suggests additional novel mechanisms which may regulate IFIT1 activity.

      Weaknesses:

      Given that there appears to be very little overlap observed in orthologs that inhibited the viruses tested, it's possible that other amino acids may be key drivers of antiviral activity in these other orthologs. Thus, it's difficult to conclude whether the findings that residues 362/4/6 are important for IFIT1 activity can be broadly applied to other orthologs, or whether these are unique to human and chimpanzee IFIT1. While additional molecular studies of the impact of these mutations on IFIT1 function (e.g. impact on IFIT complex formation) would lend further insight, as it stands, these findings demonstrate a role for these residues in IFIT1 activity.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      McDougal et al. aimed to characterize the antiviral activity of mammalian IFIT1 orthologs. They first performed three different evolutionary selection analyses within each major mammalian clade and identified some overlapping positive selection sites in IFIT1. They found that one site that is positively selected in primates is in the RNA-binding exit tunnel of IFIT1 and is tolerant of mutations to amino acids with similar biochemical properties. They then tested 9 diverse mammalian IFIT1 proteins against VEEV, VSV, PIV3, and SINV and found that each ortholog has distinct antiviral activities. Lastly, they compared human and chimpanzee IFIT1 and found that the determinant of their differential anti-VEEV activity may be partly attributed to their ability to bind Cap0 RNA. 

      Strengths: 

      The study is one of the first to test the antiviral activity of IFIT1 from diverse mammalian clades against VEEV, VSV, PIV3, and SINV. Cloning and expressing these 39 IFIT1 orthologs in addition to single and combinatorial mutants is not a trivial task. The positive connection between anti-VEEV activity and Cap0 RNA binding is interesting, suggesting that differences in RNA binding may explain differences in antiviral activity. 

      Weaknesses: 

      The evolutionary selection analyses yielded interesting results, but were not used to inform follow-up studies except for a positively selected site identified in primates. Since positive selection is one of the two major angles the authors proposed to investigate mammalian IFIT1 orthologs with, they should integrate the positive selection results with the rest of the paper more seamlessly, such as discussing the positive selection results and their implications, rather than just pointing out that positively selected sites were identified. The paper should elaborate on how the positive selection analyses PAML, FUBAR, and MEME complement one another to explain why the tests gave them different results. Interestingly, MEME which usually provides more sites did not identify site 193 in primates that was identified by both PAML and FUBAR. The authors should also provide the rationale for choosing to focus on the 3 sites identified in primates only. One of those sites, 193, was also found to be positively selected in bats, although the authors did not discuss or integrate that finding into the study. In Figure 1A, they also showed a dN/dS < 1 from PAML, which is confusing and would suggest negative selection instead of positive selection. Importantly, since the authors focused on the rapidly evolving site 193 in primates, they should test the IFIT1 orthologs against viruses that are known to infect primates to directly investigate the impact of the evolutionary arms race at this site on IFIT1 function. 

      We thank the reviewer for their assessment and for acknowledging the breadth of our dataset regarding diverse IFIT1s, the number of viruses tested, and the functional data that may correlate biochemical properties of IFIT1 orthologous proteins with antiviral function. We have expanded the introduction and results sections to better explain and distinguish between the PAML, FUBAR, and MEME analyses. Furthermore, we have expanded the discussion to incorporate the observation that site 193 is rapidly evolving in bats, as well as the observation that sites near the TPR4 loop were identified as rapidly evolving in all clades of mammals tested. We do observe an overall gene dN/dS of <1; however, this is simply the average across all codons of the entire gene and does not rule out positive selection at specific sites. This is observed for other restriction factors, as many domains are undergoing purifying selection to retain core functions (e.g., enzymatic function, structural integrity) while other domains (e.g., interfaces with viral antagonists or viral proteins) show strong positive selection. Specific examples include the restriction factors BST-2/Tetherin (PMID: 19461879) and MxA (PMID: 23084925). Furthermore, we agree that testing more IFIT1-sensitive viruses that naturally infect primates with our IFIT1 193 mutagenesis library would shed light on the influence of host-virus arms races at this site. However, VEEV does naturally infect humans as well as at least one other species of primate (PMID: 39983680).
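The point that a gene-wide dN/dS below 1 does not rule out site-specific positive selection can be illustrated with simple arithmetic. The sketch below uses hypothetical per-codon values for a toy 10-codon gene; they are not data from the manuscript. PAML, FUBAR, and MEME infer selection with likelihood or Bayesian models rather than by thresholding raw ratios as done here.

```python
# Toy illustration: a gene-wide average dN/dS below 1 can coexist with
# individual codons whose dN/dS is well above 1.
# All values are hypothetical and chosen only for illustration.
site_omega = [0.05, 0.10, 0.08, 0.20, 3.50, 0.15, 0.05, 2.80, 0.10, 0.12]

# Gene-wide summary: simple mean across codons.
gene_wide_mean = sum(site_omega) / len(site_omega)

# Codon positions (1-based) whose ratio exceeds 1.
sites_above_one = [
    position for position, omega in enumerate(site_omega, start=1) if omega > 1.0
]

print(f"Gene-wide mean dN/dS: {gene_wide_mean:.3f}")
print(f"Codons with dN/dS > 1: {sites_above_one}")
```

In this toy gene most codons are under purifying selection, so the mean stays below 1 even though two codons have ratios near 3.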

      Below we individually address the reviewers' claims of inaccurate data interpretation.

      Some of the data interpretation is not accurate. For example: 

      (1) Lines 232-234: "...western blot analysis revealed that the expression of IFIT1 orthologs was relatively uniform, except for the higher expression of orca IFIT1 and notably lower expression of pangolin IFIT1 (Figure 4B)." In fact, most of the orthologs are not expressed in a "relatively uniform" manner e.g. big brown bat vs. shrew are quite different. 

      We have now included quantification of the western blots to allow the reader to compare expression levels with the infection data (Updated Figure 4B and 4G). We have also removed the phrase “relatively uniform” from the text and have instead included text describing the quantified expression differences.

      (2) Line 245: "...mammalian IFIT1 species-specific differences in viral suppression are largely independent of expression differences." While it is true that there is no correlation between protein expression and antiviral activity in each species, the authors cannot definitively conclude that the species-specific differences are independent of expression differences. Since the orthologs are clearly not expressed in the same amounts, it is impossible to fully assess their true antiviral activity. At the very least, the authors should acknowledge that the protein expression can affect antiviral activity. They should also consider quantifying the IFIT1 protein bands and normalizing each to GAPDH for readers to better compare protein expression and antiviral activity. The same issue is in Line 267. 

      We have now included quantification and normalization of the western blots to allow the reader to compare expression levels with the infection data (Updated Figure 4B and 4G). Furthermore, we acknowledge in the text that expression differences may affect antiviral potency in infection experiments.

      (3) Line 263: "SINV... was modestly suppressed by pangolin, sheep, and chinchilla IFIT1 (Figure 4E)..." The term "modestly suppressed" does not seem fitting if there is 60-70% infection in cells expressing pangolin and chinchilla IFIT1. 

      We have modified the text to say “significantly suppressed” rather than “modestly suppressed.”

      (4) The study can be significantly improved if the authors can find a thread to connect each piece of data together, so the readers can form a cohesive story about mammalian IFIT1. 

      We appreciate the reviewer’s suggestion and have tried to make the story more cohesive by adding commentary on positive selection and by using the computational analysis to inform the potential evolutionary consequences for IFIT1 function, first through an intraspecies (human) approach and then through an interspecies approach with diverse mammals that have great sequence diversity. Furthermore, we point out that almost all IFIT1s tested in the ortholog screen were also included in our computational analysis, allowing functional observations to be connected with those from the evolutionary analyses.

      Reviewer #2 (Public review): 

      McDougal et al. describe the surprising finding that IFIT1 proteins from different mammalian species inhibit the replication of different viruses, indicating that the evolution of IFIT1 across mammals has resulted in host species-specific antiviral specificity. Before this work, research into the antiviral activity and specificity of IFIT1 had mostly focused on the human ortholog, which was described to inhibit viruses including vesicular stomatitis virus (VSV) and Venezuelan equine encephalitis virus (VEEV) but not other viruses including Sindbis virus (SINV) and parainfluenza virus type 3 (PIV3). In the current work, the authors first perform evolutionary analyses on IFIT1 genes across a wide range of mammalian species and reveal that IFIT1 genes have evolved under positive selection in primates, bats, carnivores, and ungulates. Based on these data, they hypothesize that IFIT1 proteins from these diverse mammalian groups may show distinct antiviral specificities against a panel of viruses. By generating human cells that express IFIT1 proteins from different mammalian species, the authors show a wide range of antiviral activities of mammalian IFIT1s. Most strikingly, they find several IFIT1 proteins that have completely different antiviral specificities relative to human IFIT1, including IFIT1s that fail to inhibit VSV or VEEV, but strongly inhibit PIV3 or SINV. These results indicate that there is potential for IFIT1 to inhibit a much wider range of viruses than human IFIT1 inhibits. Electrophoretic mobility shift assays (EMSAs) suggest that some of these changes in antiviral specificity can be ascribed to changes in the direct binding of viral RNAs. Interestingly, they also find that chimpanzee IFIT1, which is >98% identical to human IFIT1, fails to inhibit any tested virus. Replacing three residues from chimpanzee IFIT1 with those from human IFIT1, one of which has evolved under positive selection in primates, restores activity to chimpanzee IFIT1. Together, these data reveal a vast diversity of IFIT1 antiviral specificity encoded by mammals, consistent with an IFIT1-virus evolutionary "arms race".

      Overall, this is a very interesting and well-written manuscript that combines evolutionary and functional approaches to provide new insight into IFIT1 antiviral activity and species-specific antiviral immunity. The conclusion that IFIT1 genes in several mammalian lineages are evolving under positive selection is supported by the data, although there are some important analyses that need to be done to remove any confounding effects from gene recombination that has previously been described between IFIT1 and its paralog IFIT1B. The virology results, which convincingly show that IFIT1s from different species have distinct antiviral specificity, are the most surprising and exciting part of the paper. As such, this paper will be interesting for researchers studying mechanisms of innate antiviral immunity, as well as those interested in species-specific antiviral immunity. Moreover, it may prompt others to test a wide range of orthologs of antiviral factors beyond those from humans or mice, which could further the concept of host-specific innate antiviral specificity. Additional areas for improvement, which are mostly to clarify the presentation of data and conclusions, are described below. 

      Strengths: 

      (1) This paper is a very strong demonstration of the concept that orthologous innate immune proteins can evolve distinct antiviral specificities. Specifically, the authors show that IFIT1 proteins from different mammalian species are able to inhibit the replication of distinct groups of viruses, which is most clearly illustrated in Figure 4G. This is an unexpected finding, as the mechanism by which IFIT1 inhibits viral replication was assumed to be similar across orthologs. While the molecular basis for these differences remains unresolved, this is a clear indication that IFIT1 evolution functionally impacts host-specific antiviral immunity and that IFIT1 has the potential to inhibit a much wider range of viruses than previously described. 

      (2) By revealing these differences in antiviral specificity across IFIT1 orthologs, the authors highlight the importance of sampling antiviral proteins from different mammalian species to understand what functions are conserved and what functions are lineage- or species-specific. These results might therefore prompt similar investigations with other antiviral proteins, which could reveal a previously undiscovered diversity of specificities for other antiviral immunity proteins. 

      (3) The authors also surprisingly reveal that chimpanzee IFIT1 shows no antiviral activity against any tested virus despite only differing from human IFIT1 by eight amino acids. By mapping this loss of function to three residues on one helix of the protein, the authors shed new light on a region of the protein with no previously known function. 

      (4) Combined with evolutionary analyses that indicate that IFIT1 genes are evolving under positive selection in several mammalian groups, these functional data indicate that IFIT1 is engaged in an evolutionary "arms race" with viruses, which results in distinct antiviral specificities of IFIT1 proteins from different species. 

      Weaknesses: 

      (1) The evolutionary analyses the authors perform appear to indicate that IFIT1 genes in several mammalian groups have evolved under positive selection. However, IFIT1 has previously been shown to have undergone recurrent instances of recombination with the paralogous IFIT1B, which can confound positive selection analyses such as the ones the authors perform. The authors should analyze their alignments for evidence of recombination using a tool such as GARD (in the same HyPhy package along with MEME and FUBAR). Detection of recombination in these alignments would invalidate their positive selection inferences, in which case the authors need to either analyze individual non-recombining domains or limit the number of species to those that are not undergoing recombination. While it is likely that these analyses will still reveal a signature of positive selection, this step is necessary to ensure that the signatures of selection and sites of positive selection are accurate. 

      (2) The choice of IFIT1 homologs chosen for study needs to be described in more detail. Many mammalian species encode IFIT1 and IFIT1B proteins, which have been shown to have different antiviral specificity, and the evolutionary relationship between IFIT1 and IFIT1B paralogs is complicated by recombination. As such, the assertion that the proteins studied in this manuscript are IFIT1 orthologs requires additional support than the percent identity plot shown in Figure 3B. 

      (3) Some of the results and discussion text could be more focused on the model of evolution-driven changes in IFIT1 specificity. In particular, the chimpanzee data are interesting, but it would appear that this protein has lost all antiviral function, rather than changing its antiviral specificity like some other examples in this paper. As such, the connection between the functional mapping of individual residues with the positive selection analysis is somewhat confusing. It would be more clear to discuss this as a natural loss of function of this IFIT1, which has occurred elsewhere repeatedly across the mammalian tree. 

      (4) In other places in the manuscript, the strength of the differences in antiviral specificity could be highlighted to a greater degree. Specifically, the text describes a number of interesting examples of differences in inhibition of VSV versus VEEV from Figure 3C and 3D, but it is difficult for a reader to assess this as most of the dots are unlabeled and the primary data are not uploaded. A few potential suggestions would be to have a table of each ortholog with % infection by VSV and % infection by VEEV. Another possibility would be to plot these data as an XY scatter plot. This would highlight any species that deviate from the expected linear relationship between the inhibition of these two viruses, which would provide a larger panel of interesting IFIT1 antiviral specificities than the smaller number of species shown in Figure 4. 

      We thank the reviewer for their fair assessment of our manuscript. As the reviewer requested, we performed GARD analysis on the alignments used for PAML, FUBAR, and MEME (new Supplementary Figure 1). GARD predicted one or two breakpoints in each clade. However, much of the sequence lay after or between the predicted breakpoints. Therefore, we were able to reanalyze for sites undergoing positive selection in the large region of the sequence that does not span the breakpoints. Almost all sites originally identified as undergoing positive selection still exhibited signatures of positive selection when these breakpoints were taken into account: primates (11/12), bats (14/16), ungulates (30/37), and carnivores (2/4). To further validate our positive selection analysis, we used Recombination Detection Program 4 (RDP4) to remove inferred recombinant sequences from the primate IFIT1 alignment and then performed PAML, FUBAR, and MEME. Once again, the sites in our original analysis were largely validated by this method. Importantly, sites 170, 193, and 366 in primates, which are discussed in our manuscript, were found to be undergoing positive selection in 2 of the 3 analyses using alignments after the indicated breakpoint in GARD and after removal of recombinant sequences by RDP4. We have updated the text to acknowledge IFIT1/IFIT1B recombination more clearly and to include the GARD analysis as well as the PAML, FUBAR, and MEME reanalysis taking into account breakpoints predicted by GARD and RDP4. Furthermore, to increase evidence that the sequences used in this study for both computational and functional analysis are IFIT1 orthologs rather than IFIT1B, we have included a maximum likelihood tree built after aligning coding sequences on the C-terminal end (corresponding to bases 907-1437 of IFIT1). In Daugherty et al. 2016 (PMID: 27240734), this strategy was used to distinguish between IFIT1 and IFIT1B. All sequences used in our study grouped with IFIT1 sequences (including many confirmed IFIT1 sequences used in Daugherty et al.) rather than IFIT1B sequences or IFIT3. These new data, including the GARD, RDP4, and maximum likelihood tree analyses, are included as a new Supplementary Figure 1.

      We also agree with the reviewer that it is possible that chimpanzee IFIT1 has lost antiviral function due to the residues 364 and 366 that differ from human IFIT1. We have updated the discussion sections to include the possibility that chimpanzee IFIT1 is an example of a natural loss of function that has occurred in other species over evolution as well as the potential consequences of this occurrence. Regarding highlighting the strength of differences in antiviral activity between IFIT1 orthologs, we have included several updates to strengthen the ability of the reader to assess these differences. First, we have included a supplementary table that includes the infection data for each ortholog from the VEEV and VSV screen to allow for readers to evaluate ranked antiviral activity of the species that suppress these viruses. In addition, the silhouettes next to the dot plots indicate the top ranked hits in order of viral inhibition (with the top being the most inhibitory) giving the reader a visual representation in the figure of top antiviral orthologs during our screen. We have also updated the figure legend to inform the reader of this information.

      Reviewer #3 (Public Review):  

      Summary: 

      This manuscript by McDougal et al. demonstrates species-specific activities of diverse IFIT1 orthologs and seeks to utilize evolutionary analysis to identify key amino acids under positive selection that contribute to the antiviral activity of this host factor. While the authors identify amino acid residues as important for the antiviral activity of some orthologs and propose a possible mechanism by which these residues may function, the significance or applicability of these findings to other orthologs is unclear. However, the subject matter is of interest to the field, and these findings could be significantly strengthened with additional data.

      Strengths:

      Assessment of multiple IFIT1 orthologs shows the wide variety of antiviral activity of IFIT1, and identification of residues outside of the known RNA binding pocket in the protein suggests additional novel mechanisms that may regulate IFIT1 activity.

      Weaknesses:

      Alternative hypotheses that might explain the variable and seemingly inconsistent antiviral activity of IFIT1 orthologs were not really considered. For example, studies show that IFIT1 activity may be regulated by interaction with other IFIT proteins, but this was not assessed in this study.

      Given that there appears to be very little overlap observed in orthologs that inhibited the viruses tested, it's possible that other amino acids may be key drivers of antiviral activity in these other orthologs. Thus, it's difficult to conclude whether the findings that residues 362/4/6 are important for IFIT1 activity can be broadly applied to other orthologs, or whether these are unique to human and chimpanzee IFIT1. Similarly, while the hypothesis that these residues impact IFIT1 activity in an allosteric manner is an attractive one, there is no data to support this.  

      We thank the reviewer for their fair assessment of our manuscript. To address the weaknesses that the reviewer has pointed out we have expanded the discussion to more directly address alternate hypotheses, such as the possibility of IFIT1 activity being regulated by interaction with other IFIT proteins. Furthermore, we expanded the discussion to include an alternate hypothesis for the role of residues 364 and 366 in primate IFIT1 besides allosteric regulation. In addition, we did not intend to claim or imply that residues 364/6 are the key drivers of antiviral activity for all IFITs tested. However, we speculate that within primates these residues may play a key role as these residues differ between chimpanzee IFIT1 (which lacks significant antiviral activity towards the viruses tested in this study) and human IFIT1 (which possesses significant antiviral activity). In addition, these residues seem to be generally conserved in primate species, apart from chimpanzee IFIT1. We have included changes to the text to more clearly indicate that we highlight the importance of these residues specifically for primate IFIT1, but not necessarily for all IFIT1 proteins in all clades.

      Reviewer #1 (Recommendations for the authors): 

      (1) Readers would benefit from a more detailed background on the concept and estimation of positive selection, including the M7/8 models in PAML.

      We have included more information in the text to provide a better background for the concepts of positive selection and how PAML tests for this using M7 and M8 models.
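
      As a rough illustration of how the M7 versus M8 comparison is usually evaluated, the sketch below runs a likelihood ratio test on two log-likelihoods. The log-likelihood values are hypothetical placeholders rather than results from this study, and the use of two degrees of freedom follows the common convention for this model pair, not a detail stated in the response.

```python
import math

def lrt_m7_vs_m8(lnl_m7, lnl_m8):
    """Likelihood ratio test of M8 (allows a class with dN/dS > 1) against M7.

    Returns the statistic 2 * (lnL_M8 - lnL_M7) and its p-value under a
    chi-square distribution with 2 degrees of freedom (assumed convention).
    """
    stat = 2.0 * (lnl_m8 - lnl_m7)
    stat = max(stat, 0.0)  # guard against slightly negative values from numerical noise
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_m7_vs_m8(lnl_m7=-10250.0, lnl_m8=-10240.0)
print(f"2*delta lnL = {stat:.1f}, p = {p:.2e}")
```

      A small p-value would favor M8, that is, a model in which some codons have dN/dS above 1.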

      (2) Presentation of data 

      a) Figure 3C and 3D: is there a better way to present the infection data so the readers can tell the ranked antiviral activity of the species that suppress VEEV? 

      We have included a supplementary table that includes the infection data for each ortholog from the VEEV and VSV screen to allow for readers to evaluate ranked antiviral activity of the species that suppress these viruses. In addition, the silhouettes next to the dot plots indicate the top ranked hits in order of viral inhibition (with the top being the most inhibitory). We have updated the figure legend to inform the reader of this information as well.

      b) Figure 4C and 4D: consider putting the western blot in Supplementary Figure 1 underneath the infection data or with the heatmap so readers can compare it with the antiviral activity. 

      We have also included quantification of the western blots performed to evaluate IFIT1 expression during the experiments shown in Figure 4C and 4D in an updated Figure 4B. We have also included normalized expression values with the heatmap shown in an updated Figure 4G so the reader can evaluate potential impact of protein expression on antiviral activity for all infection experiments shown in figure 4.

      (3) Line 269-270: as a rationale for narrowing the species to human, black flying fox, and chimp IFIT1, human and black flying fox were chosen because they strongly inhibit VEEV, but pangolin wasn't included even though it had the strongest anti-VEEV activity? 

      The rationale for narrowing the species to human, black flying fox, and chimpanzee IFIT1 was related to the availability of biological tools, high-quality genome/transcriptome sequencing databases, and other factors. Specifically, human and chimpanzee IFIT1 are closely related but have variable antiviral activities, making their comparison highly relevant. Bats are well established as reservoirs for diverse viruses, whereas the reservoir status of many other mammals is less well defined. Furthermore, the need to purify large amounts of high-quality IFIT1 protein after bacterial expression was another limitation on functional studies. We have added this information to the manuscript text.

      (4) Figure 5A: to strengthen the claim that "species-specific antiviral activities of IFIT1s can be partly explained by RNA binding potential", it would be good to include one more positive and one more negative control. In other words, test the cap0 RNA binding activity of an IFIT1 ortholog that strongly inhibits VEEV and an ortholog that does not. It would also be good to discuss why chimp IFIT1 still shows dose-dependent RNA binding yet it is one of the weakest at inhibiting VEEV. 

      We appreciate the reviewer's suggestion to include more controls and expand the dataset. While we understand the potential value of expanding the dataset, we believe that human IFIT1 serves as a robust positive control and human IFIT1 R187H (RNA-binding deficient) serves as an established negative control. Future experiments with purified IFIT1 proteins from other species will indeed strengthen the evidence linking IFIT1 species-specific activity and RNA binding.

      Regarding chimpanzee IFIT1, we acknowledge there appears to be some dose-dependent Cap0 RNA-binding. However, the binding affinity is much weaker than that of human or black flying fox IFIT1. We speculate that during viral infection reduced binding affinity could impair the ability of chimpanzee IFIT1 to efficiently sequester viral RNA and inhibit viral translation. This reduction in binding affinity may, therefore, allow the cell to be overwhelmed by the exponential increase in viral RNA during replication, resulting in an ineffective antiviral IFIT1. In the literature, a similar phenomenon was observed by Hyde et al. (PMID: 24482115). In that study, the authors tested mouse Ifit1 Cap0 RNA binding by EMSA using the 5’ UTR sequence of VEEV RNA containing an A or G at nucleotide position 3. The EMSA shows binding of both the A3 and G3 Cap0 VEEV RNA sequences; however, stronger Ifit1 binding is observed for the A3 Cap0 RNA sequence. The consequences of the reduced Ifit1 binding of the G3 Cap0 VEEV RNA are observed in vitro as a substantial increase in viral titers produced from cells as well as an increase in protein produced in a luciferase-based translation assay. The authors also show the in vivo relevance of this reduction in Ifit1 binding, as WT B6 mice infected with VEEV containing the A3 UTR exhibited 100% survival, while WT B6 mice infected with VEEV containing the G3 UTR survived at a rate of only ~25%. Therefore, the literature supports that a decrease in Cap0 RNA binding by an IFIT protein (while still exhibiting Cap0 RNA binding) observed by EMSA can result in considerable alterations of viral infection both in vitro and in vivo.

      Minor: 

      (1) Line 82: "including 5' triphosphate (5'-ppp-RNA), or viral RNAs..." having a comma here will make the sentence clearer. 

      We have improved the clarity of this sentence. It now reads, “IFIT1 binds uncapped 5′triphosphate RNA (5′-ppp-RNA) and capped but unmethylated RNA (Cap0, an m<sup>7</sup>G cap lacking 2′-O methylation).”

      (2) Line 100: "...similar mechanisms have been at least partially evolutionarily conserved in IFIT proteins to restrict viral infection by IFIT proteins". 

      We have updated the text to improve clarity by revising the sentence to “VEEV TC-83 is sensitive to human IFIT1 and mouse Ifit1B, indicating at least partial conservation of antiviral function by IFIT proteins."

      (3) Line 109: "signatures of rapid evolution or positive selection" would put positive selection second because that is the more technical term that can benefit from the more layperson term (rapid evolution). 

      We have updated this sentence incorporating this suggestion. “Positive selection, or rapid evolution, is denoted by a high ratio of nonsynonymous to synonymous substitutions (dN/dS >1).”

      (4) Lines 116-117: "However, this was only assessed in a few species" would benefit from a citation. 

      We have inserted the citation.

      (5) Line 127 heading: "IFIT1 is rapidly evolving in mammals" would be more accurate to say "in major clades of mammals". 

      We have updated the text to include this suggestion.

      (6) Line 165: "IFIT1 L193 mutants". 

      We have updated the text to rephrase this for clarity.

      (7) Line 170: two strains of VEEV were mentioned in the Intro, so it would be good to specify which strain of VEEV was used?

      We have updated the text to clarify the VEEV strain. In this study, all experiments were performed using the VEEV TC-83 strain.

      (8) Line 174: "Indeed, all mutants at position 193, whether hydrophobic or positively charged, inhibited VEEV similarly to the WT..." It should read "all hydrophobic and positively charged mutants inhibited VEEV similarly to the WT...". 

      We corrected as suggested. 

      (9) Line 204: what are "control cells"? Cells that are mock-infected, or cells without IFIT1? 

      We have updated the text to improve clarity. What we refer to as control cells were cells expressing an empty vector control rather than an IFIT1.

      (10) Need to clarify n=2 and n=3 replicates throughout the manuscript. Does that refer to three independent experiments? Or an experiment with triplicate wells/samples? 

      We have updated the text to say “independent experiments” instead of “biological replicates” to prevent any confusion.  All n=2 or n=3 replicates denote independent experiments.

      (11) Line 254: "dominant antiviral effector against the related human parainfluenza virus type 5..." 

      We have updated the text to improve clarity.

      (12) Line 271: "The black flying fox (Pteropus alecto), is a model megabat species..." scientific name was italicized here but not elsewhere. Remove comma.

      We have updated the text accordingly.

      (13) Line 293: "...chimpanzee IFIT1 lacked these properties" but chimp IFIT1 can bind cap0 RNA, just at a lower level. 

      We have updated the text to acknowledge that chimpanzee IFIT1 can bind cap0 RNA, albeit at a lower level than human IFIT1.

      (14) Figure 6B: please fix the x-axis labels. They're very cramped. 

      We have updated the x-axis labels for figure 6B and figure 6D to improve clarity.

      (15) Line 609: "...trimmed and aligned"? 

      Our phrasing is to indicate that coding sequences were aligned, and gaps were removed to reduce the chance of false positive signal by underrepresented codons such as gaps or short insertions. We have removed “trimmed” from the text and changed the text to say “aligned sequences” to increase clarity.

      Reviewer #2 (Recommendations for the authors): 

      (1) Numbers less than 10 should be spelled out throughout the manuscript (e.g. line 138). 

      We have updated the text to reflect the request.

      (2) Line 165: "expression of IFIT1 193 mutants" should be rephrased. 

      We have updated the text to rephrase this sentence for clarity.

      (3) A supplemental table or file should be included that contains the accession number and species names of sequences used for evolutionary analyses and for functional testing. In addition, the alignments that were used for positive selection can be included.  

      We have included a supplemental file containing accession numbers, species names for evolutionary analysis and functional studies. In addition, this table includes the infection data for each IFIT1 homolog for the screen performed in figure 3.

      (4) The discussion of potential functions of the C-terminus of IFIT1 should include possible interactions with other proteins. In particular, the C-terminus of IFIT1 has been shown to interact with IFIT3 in a way that modulates its activity (PMID: 29525521). Although residues 362-366 were not shown in that paper to interact with a fragment of IFIT3, it is possible that these residues may be important for interaction with full-length IFIT3 or some other IFIT1 binding partner. 

      We thank the reviewer for their suggestion. We have expanded the discussion to explore the possibility that residues 364 and 366 of IFIT1 may be involved in IFIT1-IFIT3 interactions and consequently Cap0 RNA-binding and antiviral activity.

      (5) The quantification of the EMSAs should be described in more detail. In particular, from looking at the images shown in Figure 5A, it would appear that human and chimpanzee IFIT1 show similar degrees of probe shift, while the human R187H panel shows no shifting at all. However, the quantification shows chimpanzee IFIT1 as being statistically indistinguishable from human R187H. Additional information on how bands were quantified and whether they were normalized to unshifted RNA would be helpful in attempting to resolve this visual discordance. 

      EMSAs were quantified by determining Adj. Vol. Intensity in ImageLab (BioRad), which subtracts background signal, after imaging at the same exposure and SYBR Gold staining time. To determine Adj. Vol. Intensity, we drew a box (the same size for each gel and lane in each replicate) in each lane above the free probe. These values were not normalized to unshifted RNA; however, equal amounts of RNA were loaded. While the ANOVA shows no significant difference between human R187H and chimpanzee IFIT1 band shift intensity, this is potentially due to the between-group variance in the ANOVA. The AUC value for chimpanzee IFIT1 is 36.4% higher than that for R187H.

      The AUC of Adj. Vol. Intensity of the human IFIT1 band shift is roughly 2-fold higher than that of chimpanzee IFIT1. We believe this matches the visual representation as well, as human IFIT1 has a darker “upper” band in the shift, as well as a clear dark “lower” band that is not well defined in the chimpanzee shift. Furthermore, the upper band of the chimpanzee IFIT1 shift in the 400nM lane appears to be as intense as the upper band in the 240nM human IFIT1 lane, without taking into account the lower band seen for human IFIT1. We included this quantification because Kd could not be calculated, as there was no clear probe disappearance; we do not intend this quantification to act as a substitute for binding affinity calculations, but rather to aid the reader in data interpretation.
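
      The area-under-the-curve comparison described above can be sketched as follows. The protein concentrations and band intensities are hypothetical placeholders (the response does not list the underlying values), and trapezoidal integration is an assumed method, since the response does not say how the AUC was computed.

```python
def trapezoid_auc(x, y):
    """Area under y(x) by the trapezoidal rule; x must be increasing."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least 2 points")
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))

def percent_higher(a, b):
    """How much higher a is than b, expressed as a percentage of b."""
    return 100.0 * (a - b) / b

# Hypothetical protein concentrations (nM) and background-subtracted
# shifted-band intensities (arbitrary units); not data from the study.
conc_nm = [0, 80, 160, 240, 400]
shift_a = [0.0, 1.0, 2.0, 3.0, 4.0]  # stand-in for a weaker binder
shift_b = [0.0, 2.0, 4.0, 6.0, 8.0]  # stand-in for a stronger binder

auc_a = trapezoid_auc(conc_nm, shift_a)
auc_b = trapezoid_auc(conc_nm, shift_b)
print(f"AUC ratio b/a = {auc_b / auc_a:.2f}")
print(f"b is {percent_higher(auc_b, auc_a):.1f}% higher than a")
```

      Because no normalization to the free probe is applied, a comparison of this kind only makes sense when loading, exposure, and staining are matched across lanes, as the response states they were.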

      Reviewer #3 (Recommendations for the authors): 

      (1) IFIT1 has been demonstrated to function in conjunction with other IFIT proteins, do you think the absence of antiviral activity is due to isolated expression of IFIT1 without these cofactors, and therefore might explain why there was little overlap observed in orthologs that inhibited the viruses tested (Figure 3, lines 209-210). 

      We do not believe that isolated expression of IFIT1 without cofactors (such as orthologous IFIT proteins) would fully explain the disparities in antiviral activity, as many of the IFIT1s that were expressed inhibited either VSV or VEEV in our screen. However, we acknowledge that expressing IFIT1 alone is a limitation of our study, as IFIT1 antiviral activity and RNA binding can be modulated by interactions with other IFIT proteins. Therefore, we do believe it is possible that co-expression of IFIT1 with other IFITs from a given species might enhance antiviral activity. Future studies may shed light on this.

      (2) Figure 5 - Calculating the Kd for each protein would be more informative. How does the binding affinity of these IFIT1 proteins compare to that which has previously been reported? 

      We are unable to accurately determine Kd because the free-probe signal is not substantially diminished. Therefore, we are only able to compare IFIT1 protein binding between species without an accurate mathematical calculation of binding affinity. Our result does appear similar to that of mouse Ifit1 binding to VEEV RNA (PMID: 24482115), in which the authors also do not calculate a Kd for their RNA EMSA.

      (3) Mutants 364 and 366 may not have direct contact with RNA, but RNA EMSA data presented suggest that the binding affinity may be different (though this is hard to conclude without Kd data). Additional biochemical data with these mutants might provide more insight here. 

      We agree that further studies using 364 and 366 double mutant human and chimpanzee protein in EMSAs would provide additional biochemical data and provide insight into the role of these residues in direct RNA binding. We acknowledge this is a limitation of our study as we provide only genetic data demonstrating the importance of these residues.

      (4) Given that there appears to be very little overlap observed in orthologs that inhibited the viruses tested, it's possible that other amino acids may be key drivers of antiviral activity in these other orthologs. Thus, it's difficult to conclude whether the findings that residues 362/4/6 are important for IFIT1 activity can be broadly applied to other orthologs. A more systematic assessment of the role of these mutations across multiple diverse orthologs would provide more insight here. Do other antiviral proteins show this trend (ie exhibit little overlap in orthologs that inhibit these viruses). What do you think might be driving this? 

      We agree that residues other than 364 and 366 may be key drivers of antiviral activity across the IFIT1 orthologs tested. We do not hypothesize that the importance of residues 364 and 366 will apply broadly across IFIT1 from diverse clades of mammals, as overall amino acid identity can differ by over 30%. However, based on the chimpanzee and human IFIT1 data, as well as sequence alignment within primates specifically, we believe these residues may be key for primate (but not necessarily other clades of mammals) IFIT1 antiviral activity.

      Regarding whether other antiviral proteins show little overlap in orthologs that inhibit a given virus, to our knowledge such a functional study with a dataset of orthologs this large and divergent has not been performed. However, there are many examples of restriction factors exhibiting species-specific antiviral activity when ortholog screens have been performed. For example, HIV was reported to be suppressed by MX2 orthologs from human, rhesus macaque, and African green monkey, but not sheep or dog MX2 (PMID: 24760893). In addition, foamy virus was inhibited by the human and rhesus macaque orthologs of PHF11, but not the mouse and feline orthologs (PMID: 32678836). Furthermore, studies from our lab have shown variability in RTP4 ortholog antiviral activity towards viruses such as hepatitis C virus (HCV), West Nile virus (WNV), and Zika virus (ZIKV) (PMID: 33113352).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      Weiss and co-authors presented a versatile probabilistic tool. aTrack helps in classifying tracking behaviors and understanding important parameters for different types of single-particle motion: Brownian, confined, or directed. The tool can be used further to analyze populations of tracks and the number of motion states. This is a stand-alone software package, making it user-friendly for a broad group of researchers.

      Strengths: 

      This manuscript presents a novel method for trajectory analysis. 

      Weaknesses: 

      (1) In the results section, is there any reason to choose the specific range of track length for determining the type of motion? The starting value is fine, and would be short enough, but do the authors have anything to report about how much is too long for the model? 

      We chose to test a range of track lengths (from five to hundreds of steps) to cover the broad range of scenarios, from single proteins or fluorophores to brighter objects with more labels. While there is no upper limit per se, the computation time of our method scales linearly with track length; 100 time points take ~2 minutes to run on a standard consumer-level desktop CPU. We have added the following sentence to note how the time cost scales with trajectory length:

      “The recurrent formula enables our model computation time to scale linearly with the number of time points.”

      (2) Robustness to model mismatches is a very important section that the authors have uplifted diligently. Understanding where and how the model is limited is important. For example, the authors mentioned the limitation of trajectory length, do the authors have any information on the trajectory length range at which this method works accurately? This would be of interest to readers who would like to apply this method to their own data. 

      We agree that limitations are important to estimate, and trajectory length is an important consideration when choosing how to analyze a dataset. We report the categorization certainty, i.e. the likelihood differences, for a range of track lengths (Fig. 2 a,c, Fig. 3c-d, and Fig. 4 c,g.).

      For example, here are the key plots from Fig. 2 quantifying the relative likelihoods, where being within the light region is necessary. The light areas represent a useful likelihood ratio.

      We only performed analysis up to track lengths of 600 time steps, but parameter estimates and significance can only improve as the track length increases, as long as the model assumptions hold. The broader limitations and future opportunities for new methods are now expanded upon in the discussion, for example, switching between states and ambiguities between models and states (bound vs. very slow diffusion vs. very slow motion).

      (3) aTrack extracts certain parameters from the trajectories to determine the motion types. However, it is not very clear how certain parameters are calculated. For example, is the diffusion coefficient D calculated from fitting, and how is the confinement factor defined and estimated, with equations? This information will help the readers to understand the principles of this algorithm.

      We apologize for the confusion. All the model parameters are fit using the maximum likelihood approach. To make this point clearer in the manuscript, we have made three changes:

      (1) We modified the following sentence to replace “determined” with "fit”:

      “Finally, Maximum Likelihood Estimation (MLE) is used to fit the underlying parameter value”

      (2) We added the following sentence in the main text :

      “In our model, the velocity is the characteristic parameter of directed motion and the confinement factor represents the force within a potential well. More precisely, the confinement factor $l$ is defined such that at each time step the particle position is updated by $l$ times the distance particle/potential well center (see the Methods section for more details).”.

      (3) We have added a new section in the methods, called Fitting Method, where we have added the explanation below:

      “For the pure Brownian model, the parameters are the diffusion coefficient and the localization error. For the confinement model, the parameters are the diffusion coefficient, the localization error, the confinement factor, and the diffusion coefficient of the potential well. For the directed model, the parameters are the diffusion coefficient, the localization error, the initial velocity and the acceleration variance.

      These parameters are estimated using the maximum likelihood approach, which consists of finding the parameters that maximize the likelihood. We perform this fitting step using gradient descent via a TensorFlow model. All the estimates presented in this article are obtained from a single set of initial parameters to demonstrate that the convergence capacity of aTrack is robust to the initial parameter values.”
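
      To make the confinement factor concrete, the sketch below simulates a one-dimensional confined track in the way the added text describes: at each step the particle is pulled toward the well center by $l$ times its current distance from that center, then takes a Brownian step, and the observed position adds localization error. The function name, parameter names, and the fixed well center are assumptions made for illustration; the model described above also lets the potential well diffuse, which is omitted here.

```python
import random

def simulate_confined_track(n_steps, l, step_sd, loc_error_sd,
                            center=0.0, start=0.0, seed=None):
    """Simulate a 1D confined track (illustrative, not aTrack's code).

    l            -- confinement factor: fraction of the distance to the well
                    center recovered at each step (0 gives free diffusion)
    step_sd      -- standard deviation of the Brownian displacement per step
    loc_error_sd -- standard deviation of the localization error
    Returns (hidden_positions, observed_positions).
    """
    rng = random.Random(seed)
    hidden, observed = [], []
    x = start
    for _ in range(n_steps):
        x += l * (center - x)          # pull toward the well center
        x += rng.gauss(0.0, step_sd)   # Brownian displacement
        hidden.append(x)
        observed.append(x + rng.gauss(0.0, loc_error_sd))  # measurement noise
    return hidden, observed

free, _ = simulate_confined_track(500, l=0.0, step_sd=0.1, loc_error_sd=0.02, seed=1)
conf, _ = simulate_confined_track(500, l=0.2, step_sd=0.1, loc_error_sd=0.02, seed=1)
print("free range:    ", max(free) - min(free))
print("confined range:", max(conf) - min(conf))
```

      With the same step size, a confined track would typically explore a narrower range than a free one, which is the kind of signature the confinement model is fit to.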

      (4) The authors mentioned the scenario where a particle may experience several types of motion simultaneously. How are these motions simulated and what do they mean in terms of motion types? Are they mixed motion (a particle switches motion types in the same trajectory) or do they simply present features of several motion types? It is not intuitive to the readers that a particle can be diffusive (Brownian) and directed at the same time.

      In the text, we present an example where one can observe this type of motion to help the reader understand when this type of motion can be met: “Sometimes, particles undergo diffusion and directed motion simultaneously, for example, particles diffusing in a flowing medium (Qian 1991).”

      This is simulated by the addition of two terms affecting the hidden position variable before adding a localization error term to create the observed variable. In the analysis, this manifests as non-zero values for the diffusion coefficient and the linear velocity. See, for example, Figure 4g and the associated text, where a single particle moves with a directed component and a Brownian diffusion component at each step.
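
      A minimal sketch of the simulation described above: the hidden position receives a directed displacement and a Brownian displacement at each step, and localization error is added afterwards to produce the observed position. A constant velocity is assumed here for simplicity, and all names and values are illustrative rather than taken from aTrack.

```python
import random

def simulate_directed_diffusive_track(n_steps, velocity, step_sd, loc_error_sd,
                                      start=0.0, seed=None):
    """1D track with a constant drift plus Brownian motion at every step."""
    rng = random.Random(seed)
    x = start
    observed = []
    for _ in range(n_steps):
        x += velocity                  # directed term
        x += rng.gauss(0.0, step_sd)   # diffusive term
        observed.append(x + rng.gauss(0.0, loc_error_sd))  # localization error
    return observed

def mean_step(track):
    """Average displacement per step, a simple estimate of the drift.

    Consecutive displacements telescope, so only the end points are needed.
    """
    return (track[-1] - track[0]) / (len(track) - 1)

track = simulate_directed_diffusive_track(1000, velocity=0.05, step_sd=0.1,
                                          loc_error_sd=0.02, seed=0)
print("estimated drift per step:", mean_step(track))
```

      Both components act at every step, so such a track is not a switch between states: it has a non-zero diffusion coefficient and a non-zero velocity throughout.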

      We did not simulate transitions between types of motion. Switching is not treated by this current model; however, this limitation is described in the discussion and our team and others are currently working on addressing this challenge.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors present a software package "aTrack" for identification of motion types and parameter estimation in single-particle tracking data. The software is based on maximum likelihood estimation of the time-series data given an assumed motion model and likelihood ratio tests for model selection. They characterized the performance of the software mostly on simulated data and showed that it is applicable to experimental data. 

      Strengths: 

      A potential advantage of the presented method is its wide applicability to different motion types. 

      Weaknesses: 

      (1) There has been a lot of similar work in this field. Even though the authors included many relevant citations in the introduction, it is still not clear what this work uniquely offers. Is it the first time that direct MLE of the time-series data was developed? Suggestions to improve would include (a) better wording in the introduction section, (b) comparing to other popular methods (based on MSD, step-size statistics (Spot-On, eLife 2018;7:e33125), for example) using the simulated dataset generated by the authors, (c) comparing to other methods using data set in challenges/competitions (Nat. Comm (2021) 12:6253).  

      We thank the reviewer for this suggestion and agree that the explanation of the innovative aspects of our method in the introduction was not clear enough. We have now modified the introduction to better explain what is improved here compared to previous approaches.

      “The main innovations of this model are: 1) it uses analytical recurrence formulas to perform the integration step for complex motion, improving speed and accuracy; 2) it handles both confined and directed motion; 3) anomalous parameters, such as the center of the potential well and the velocity vector are allowed to change through time to better represent tracks with changing directed motion or confinement area; and lastly 4) for a given track or set of tracks, aTrack can determine whether tracks can be statistically categorized as confined or directed, and the parameters that best describe their behavior, for example, diffusion coefficient, radius of confinement, and speed of directed motion.”

      Regarding alternatives, in Figure 6 we compare our method to the best-performing algorithm of the 2021 Anomalous Diffusion (AnDi) Challenge mentioned by the reviewer (RANDI, Argun et al, arXiv, 2021, Muñoz-Gil et al, Nat Com. 2021). Notably, both methods performed similarly on fBm, but ours was more robust in cases where there were small differences between the process underlying the data and the model assumptions, a likely scenario in real datasets. Spot-On was not included because it only handles multiple populations of Brownian diffusers, which prevents a quantitative comparison.

      (2) The Hypothesis testing method presented here has a number of issues: first, there is no definition of testing statistics. Usually, the testing statistics are defined given a specific (Type I and/or Type II) error rate. There is also no discussion of the specificity and sensitivity of the testing results (i.e. what's the probability of misidentification of a Brownian trajectory as directed? etc).

      We now explain our statistical approach and how to perform hypothesis testing with our metric in a new supplementary section, Statistical test. 

      We use the likelihood ratio as a more conservative alternative to the p-value. In Fig S2, we show that our metric is an upper bound of the p-value and can be used to perform hypothesis testing with a chosen type I error rate. 

      Related, it is not clear what Figure 2e (and other similar plots) means, as the likelihood ratio is small throughout the parameter space. Also, for likelihood ratio tests, the authors need to discuss how model complexity affects the testing outcome (as more complex models tend to be more "likely" for the data) and also how the likelihood function is normalized (normalization is not an issue for MLE but critical for ratio tests). 

      We present the likelihood ratio as an upper bound of the p-value. Therefore, we can reject the null hypothesis if it is smaller than a given threshold, e.g. 0.05, but this number should be decreased if multiple tests are performed. The colorscale we show in the figure is meant to highlight the working range (light), and ambiguous range (dark) of the method.

      As the reviewer mentions, we expect the alternative hypothesis to result in higher likelihoods than the simpler null hypothesis for null hypothesis tracks, but, as seen in Fig. S2, the likelihood ratio of a dataset corresponding to the null hypothesis is strongly skewed toward its upper limit of 1. This means that for most of the tracks, the likelihood is not (or little) affected by the model complexity. The likelihoods of all the models are normalized so that their integrals over the data equal 1/A, where A is the area of the field of view, which is independent of the model complexity.
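
      The decision rule described in this response can be sketched as follows. This is a minimal illustration, not code from aTrack: the function names are hypothetical, the log-likelihood values are made up, and dividing the threshold by the number of tests (a Bonferroni-style correction) is one assumed way of "decreasing" it for multiple tests.

```python
import math

def likelihood_ratio(log_l_null, log_l_alt):
    """Return L_null / L_alt computed from log-likelihoods.

    The ratio is capped at 1 (its stated upper limit) to guard against
    small numerical differences from the optimisation; the cap is an
    assumption of this sketch.
    """
    return math.exp(min(0.0, log_l_null - log_l_alt))

def reject_null(log_l_null, log_l_alt, alpha=0.05, n_tests=1):
    """Reject the null model if the ratio is below the corrected threshold."""
    return likelihood_ratio(log_l_null, log_l_alt) < alpha / n_tests

# Hypothetical log-likelihoods for one track (Brownian null vs. alternative).
single_test = reject_null(-120.0, -115.0)              # threshold 0.05
many_tests = reject_null(-120.0, -115.0, n_tests=20)   # threshold 0.0025
print(single_test, many_tests)
```

      With these made-up values the ratio is exp(-5), about 0.0067, so the null is rejected for a single test but not once the threshold is tightened for 20 tests.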

      (3) Relating to the mathematical foundation (Figure 1b). The measured positions are drawn as direct arrows from the real position states: this infers instantaneous localization. In reality, there is motion blur which introduces a correlation of the measured locations. Motion blur is known to introduce bias in SPT analysis, how does it affect the method here? 

      The reviewer raises an important point as our model does not explicitly consider motion blur. We have now added a paragraph that presents how our model performs in case of motion blur in the section called Robustness to model mismatches. This section and the corresponding new Supplemental Fig. S7 demonstrate that the estimated diffusion length is accurate so long as the static localization error is higher than the dynamic localization error. If the dynamic localization error is higher, our model systematically underestimates the diffusion length by a factor 0.81 = (2/3)<sup>0.5</sup> which can be corrected for with an added post-processing step.  
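
      The post-processing correction mentioned above amounts to dividing the estimated diffusion length by the stated factor. A minimal sketch, assuming the estimate is a plain number in arbitrary length units (the function name is hypothetical and not part of aTrack):

```python
import math

# Underestimation factor quoted in the response: (2/3) ** 0.5, roughly 0.816.
BLUR_FACTOR = math.sqrt(2.0 / 3.0)

def correct_diffusion_length(d_est):
    """Undo the systematic underestimation caused by motion blur.

    Only meaningful in the regime described in the response, i.e. when the
    dynamic localization error exceeds the static localization error.
    """
    return d_est / BLUR_FACTOR

# Hypothetical estimated diffusion length, in the same units as the tracks.
print(correct_diffusion_length(0.08))
```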

      (4) The authors did not go through the interpretation of the figure. This may be a matter of style, but I find the figures ambiguous to interpret at times.  

      We thank the reviewer for their feedback on improving the readability. To avoid overly repetitive and lengthy sections of text, we have opted for a concise approach. This allows us to present closely related panels at the same point in the text, while not ignoring important variations and tests. In response to this feedback from the reviewers, we have added more information and interpretation throughout our manuscript to improve interpretability.

      (5) It is not clear to me how the classification of the 5 motion types was accomplished. 

      We have modified the specific text related to this figure to describe an illustrative example to show how one could use aTrack on a dataset where not that much is known: First, we present the method to determine the number of states; second, we verify the parameter estimates correspond to the different states.  

      Classifying individual tracks is possible. While not done in the section corresponding to Fig. 5, this is done in Fig. 7 and a new supplementary plot, Fig. S9b (shown below). In brief, this is accomplished with our method by computing the likelihood of each track given each state. The probability that a given track is in state k equals the likelihood of the track given the state divided by the sum of the likelihoods given the different states. 
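
      The per-track classification rule stated above (likelihood of the track given a state, divided by the sum over states) can be illustrated with a short sketch. It is not taken from the aTrack code base: the function name and the log-likelihood values are hypothetical, and the states are weighted equally, as in the sentence above. Working in log space and subtracting the maximum avoids numerical underflow for long tracks.

```python
import math

def state_probabilities(log_likelihoods):
    """Convert per-state log-likelihoods of one track into probabilities.

    Implements p_k = L_k / sum_j L_j, evaluated stably in log space.
    """
    largest = max(log_likelihoods)
    weights = [math.exp(ll - largest) for ll in log_likelihoods]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-likelihoods of one track under three states.
probs = state_probabilities([-105.0, -100.0, -110.0])
best = max(range(len(probs)), key=probs.__getitem__)
print(probs, best)
```

      Here the second state (index 1) receives almost all of the probability, so the track would be assigned to it.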

      (6) Figure 3. Caption: what is ((d_{est}-0.1)/0.1)? Also panel labeled as "d" should be "e". 

      Thank you for bringing these errors to our attention, the panel and caption have been corrected.

      Reviewer #3 (Public Review): 

      Summary: 

      In this work, Simon et al. present a new computational tool to assess non-Brownian single-particle dynamics (aTrack). The authors provide a solid groundwork to determine the motion type of single trajectories via an analytical integration of multiple hidden variables, specifically accounting for localization uncertainty, directed/confined motion parameters, and, very novel, allowing for the evolution of the directed/confined motion parameters over time. This last step is, to the best of my knowledge, conceptually new and could prove very useful for the field in the future. The authors then use this groundwork to determine the motion type and its corresponding parameter values via a series of likelihood tests. This accounts for obtaining the motion type which is statistically most likely to be occurring (with Brownian motion as null hypothesis). Throughout the manuscript, aTrack is rigorously tested, and the limits of the methods are fully explored and clearly visualised. The authors conclude with allowing the characterization of multiple states in a single experiment with good accuracy and explore this in various experimental settings. Overall, the method is fundamentally strong, well-characterised, and tested, and will be of general interest to the single-particle-tracking field. 

      Strengths: 

      (1) The use of likelihood ratios gives a strong statistical relevance to the methodology. There is a sharp decrease in likelihood ratio between e.g. confinement of 0.00 and 0.05 and velocity of 0.0 and 0.002 (figure 2c), which clearly shows the strength of the method - being able to determine 2nm/timepoint directed movement with 20 nm loc. error and 100 nm/timepoint diffusion is very impressive. 

      We apologize for the confusion: the directed tracks in Fig. 2 have no Brownian-motion component, i.e. D=0. We have made this clearer in the main text. Specifically, this section of the text refers to a track in linear motion with 2 nm displacements per step. With 70 time points (69 steps), a single particle that moved 138 nm with a localization error of 20 nm (95% uncertainty range of 80 nm) can be statistically distinguished from slow diffusive motion.
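
      The numbers quoted in this reply can be checked with a few lines of arithmetic. The sketch assumes that the "95% uncertainty range" means the full width of a ±1.96 σ interval for a Gaussian localization error, which is an interpretation rather than something stated in the response.

```python
n_time_points = 70
step_nm = 2.0        # directed displacement per step
loc_error_nm = 20.0  # localization error (standard deviation)

n_steps = n_time_points - 1
net_displacement_nm = n_steps * step_nm

# Assumed meaning of the 95% range: full width of a +/- 1.96 sigma interval.
range_95_nm = 2 * 1.96 * loc_error_nm

print(n_steps, net_displacement_nm, range_95_nm)
```

      Under this reading the net displacement (138 nm) is well above the roughly 80 nm uncertainty range of a single localization.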

      In Fig. 4g, we explore the capabilities of our method to detect if a diffusive particle also has a directed motion component. 

      (2) Allowing the hidden variables of confinement and directed motion to change during a trajectory (i.e. the q factor) is very interesting and allows for new interpretations of data. The quantifications of these variables are, to me, surprisingly accurate, but well-determined. 

      (3) The software is well-documented, easy to install, and easy to use. 

      Weaknesses: 

      (1) The aTrack principle is limited to the motions incorporated by the authors, with, as far as I can see, no way to add new analytical non-Brownian motion. For instance, being able to add a dynamical state-switching model (i.e. quick on/off switching between mobile and non-mobile, for instance, repeatable DNA binding of a protein), could be of interest. I don't believe this necessarily has to be incorporated by the authors, but it might be of interest to provide instructions on how to expand aTrack.  

      We agree that handling dynamic state switching is very useful and highlight this potential future direction in the discussion. The revised text reads:

      “An important limitation of our approach is that it presumes that a given track follows a unique underlying model with fixed parameters. In biological systems, particles often transition from one motion type to another; for example, a diffusive particle can bind to a static substrate or molecular motor (46). In such cases, or in cases of significant mislinkings, our model is not suitable. However, this limitation can be alleviated by implicitly allowing state transitions with a hidden Markov Model (15) or alternatives such as change-point approaches (30, 47, 48), and spatial approaches (49).”

      (2) The experimental data does not very convincingly show the usefulness of aTrack. The authors mention that SPBs are directed in mitosis and not in interphase. This can be quantified and studied by microscopy analysis of individual cells and confirming the aTrack direction model based on this, but this is not performed. Similarly, the size of a confinement spot in optical tweezers can be changed by changing the power of the optical tweezer, and this would far more strongly show the quantitative power of aTrack. 

      We agree with the reviewer and have revised the biological experiment section significantly to better illustrate the potential of aTrack in various use cases.

      Now, we show an experiment to quantify the effect of LatA, an actin inhibitor, on the fraction of directed tracks obtained with aTrack. We find that LatA significantly decreases directed motion while a LatA-resistant mutant is not affected (Fig. 7a-c).

      As suggested by the reviewer, we have expanded the optical tweezer experiment by varying the laser power. As expected, increasing the laser power decreases the confinement radius.

      (3) The software has a very strict limit on the number of data points per trajectory, which is a user input. Shorter trajectories are discarded, while longer trajectories are cut off to the set length. It is not explained why this is necessary, and I feel it deletes a lot of useful data without clear benefit (in experimental conditions).

      We thank the reviewer for this recommendation; we have now modified the architecture of our model to enable users to consider tracks of multiple lengths. Note that the computation time is proportional to the longest track length times the number of tracks.  

      Reviewer #2 (Recommendations For The Authors): 

      Develop a better mathematical foundation for the likelihood ratio tests. 

      To address this recommendation, we added more explanation of the likelihood ratio tests and their interpretation in a new section entitled Statistical test in the supplementary information.

      Place this work in clearer contexts. 

      We have now revised the introduction to better contextualize this work.

      Improve manuscript clarity. 

      Based on reviewer feedback and input from others, we have addressed this point throughout the article to improve readability.

      Make the code available. 

      The code is available on https://github.com/FrancoisSimon/aTrack, now including code for track generation.

      Reviewer #3 (Recommendations For The Authors): 

      (1) I believe the underlying model presented in Figure 1 is of substantial impact, especially when considering it as a simulation tool. I would suggest the authors make their method also available as a simulator (as far as I can tell, this is not explicitly done in their code repository, although logically the code required for the simulator should already be in the codebase somewhere). 

      Thank you for this suggestion. The simulation scripts are now in the GitHub repository together with the rest of the analysis method: https://github.com/FrancoisSimon/aTrack

      (2) The authors should explore and/or discuss the effects of wrong trajectory linking to their method. Throughout the text, fully correct trajectory linking is assumed and assessed, while in real experiments, it is often the case that trajectory linking is wrong, e.g. due to blinking emitters, imaging artefacts, high-density localizations, etc etc. This would have a major impact on the accuracy of trajectories, and it is extremely relevant to explore how this is translated to the output of aTrack. 

      As the reviewer notes, our current model does not account for track mislinking. This limits the method to data with lower fluorophore-densities, which is the typical use-case for SPT. We have added a brief description of the issue into the discussion of limitations.  

      (3) aTrack only supports 2D-tracking, but I don't believe there is a conceptual reason not to have this expanded to three dimensions. 

      The stand-alone software is currently limited to 2D tracks, however, the aTrack Python package works for any number of dimensions (i.e. 1-3). Note that since the current implementation assumes a single localization error for all axes, more modifications may be required for some types of 3D tracking. See https://github.com/FrancoisSimon/aTrack for more details about aTrack implementations.

      (4) Crucial information is missing in the experimental demonstrations. Especially in the NP-bacteria dataset, I miss scalebars, and information on the number of tracks. It is not explained why 5 different states are obtained - especially because I would naively expect three states: immobile NPs (e.g. stuck to glass), diffusing NPs, and NPs attached to bacteria, and thus directed. Figure 7e shows three diffusive states (why more than one?), no immobile states (why?), and two directed states (why?). 

      We thank the reviewer for pointing out these issues. We have now added scalebars and more experimental details to the figure and text, and have modified the plot to more clearly distinguish the directed nanoparticles that are attached to cells from the diffusive nanoparticles.  

      Likely, our focal plane was too high to see the particles stuck on glass. The multiple diffusive states may be caused by different sizes of nanoparticle complexes, the multiple directed states can be caused by the fact that directed motion of the cell-attached-nanoparticles occasionally shows drastic changes of orientations. We have also clarified in the text how multiple states can help handle a heterogeneous population as was shown by Prindle et al. 2022, Microbiol Spectr. The characterization and phenotyping of microbial populations by nanoparticle tracking was published in Zapata et al. 2022, Nanoscale. 

      (5) I don't think I agree that 'robustness to model mismatches' is a good thing. Very crudely, the fact that aTrack finds fractional Brownian motion to be normal Brownian motion is technically a downside - and this should be especially carefully positioned if (in the future) a fractional Brownian motion model would be added to aTrack. I think that the author's point can be better tested by e.g. widely varying simulated vs fitted loc precision/diffusion coefficient (which are somewhat interchangeable).

      In this context, our intention in describing the robustness to “model mismatches” refers to classifying subdiffusion as subdiffusive irrespective of the exact subdiffusion motion physics (as well as superdiffusion), that is, to use aTrack how MSD analysis is often deployed. This is important in the context of real-world applications where simple mathematical models cannot perfectly represent real tracks with greater complexity. 

      Inevitably, some fraction of tracks with a pure Brownian motion may appear to match with a fractional Brownian motion, and thus statistical tests are needed to determine if this is significant. In general, aTrack finds fBm to be normal Brownian motion only when the anomalous coefficient is near 1, i.e. when the two models are indeed the same. When analysing fBm tracks with anomalous coefficients of 0.5 or 1.5, aTrack finds that these tracks are better explained by our confined diffusion model or directed motion model, respectively (Please see Fig. 6a, copied below). 

      To better clarify our objective, the section now has a brief introduction that reads:

      “One of the most important features of a method is its robustness to deviations from its assumptions. Indeed, experimental tracking data will inevitably not match the model assumptions to some degree, and models need to be resilient to these small deviations.”  

      Smaller points: 

      (1) It is not clear what a biological example is of rotational diffusion. 

      We modified the text to better explain the use of rotational diffusion.

      (2) The text in the section on experimental data should be expanded and clarified, there currently are multiple 'floating sentences' that stop halfway, and it does not clearly describe the biological relevance and observed findings.  

      We thank the reviewer for pointing out this issue. We have reworked the experimental section to better and more clearly explain the biological relevance of the findings.

      (3) Caption of figure 3: 'd' should be 'e'. 

      (4) Caption of Figure 7: log-likelihood should be Lconfined - Lbrownian, I believe. 

      (5) Equation number missing in SI first sentence. 

      (6) Supplementary Figure 1 top part axis should be Lc-Lb instead of Ld-Lb. 

      We have made these corrections, thank you for bringing them to our attention.

    1. Reviewer #2 (Public review):

      Summary:

      The authors report that Arabidopsis short HSFs S-HsfA2, S-HsfA4c, and S-HsfB1 confer extreme heat. They have truncated DNA binding domains that bind to a new heat-regulated element. Considering Short HSFA2, the authors have highlighted the molecular mechanism by which S-HSFs prevent HSR hyperactivation via negative regulation of HSP17.6B. The S-HsfA2 protein binds to the DNA binding domain of HsfA2, thus preventing its binding to HSEs, eventually attenuating HsfA2-activated HSP17.6B promoter activity. This report adds insights to our understanding of heat tolerance and plant growth.

      Strengths:

      (1) The manuscript represents ample experiments to support the claim.

      (2) The manuscript covers a robust number of experiments and provides specific figures and graphs in support of their claim.

      (3) The authors have chosen a topic to focus on stress tolerance in changing environment.

      (4) The authors have summarized the probable mechanism using a figure.

      Weaknesses:

      Quite minimum

      (1) Fig. 3. the EMSA to reveal binding

      (2) Alignment of supplementary figures 6-7.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      In the present work, Chen et al. investigate the role of short heat shock factors (S-HSF), generated through alternative splicing, in the regulation of the heat shock response (HSR). The authors focus on S-HsfA2, an HSFA2 splice variant containing a truncated DNA-binding domain (tDBD) and a known transcriptional-repressor leucine-rich domain (LRD). The authors found a two-fold effect of S-HsfA2 on gene expression. On the one hand, the specific binding of S-HsfA2 to the heat-regulated element (HRE), a novel type of heat shock element (HSE), represses gene expression. This mechanism was also shown for other S-HSFs, including HsfA4c and HsfB1. On the other hand, S-HsfA2 is shown to interact with the canonical HsfA2, as well as with a handful of other HSFs, and this interaction prevents HsfA2 from activating gene expression. The authors also identified potential S-HsfA2 targets and selected one, HSP17.6B, to investigate the role of the truncated HSF in the HSR. They conclude that S-HsfA2-mediated transcriptional repression of HSP17.6B helps avoid hyperactivation of the HSR by counteracting the action of the canonical HsfA2.

      The manuscript is well written and the reported findings are, overall, solid. The described results are likely to open new avenues in the plant stress research field, as several new molecular players are identified. Chen et al. use a combination of appropriate approaches to address the scientific questions posed. However, in some cases, the data are inadequately presented or insufficient to fully support the claims made. As such, the manuscript would highly benefit from tackling the following issues:

      (1) While the authors report the survival phenotypes of several independent lines, thereby strengthening the conclusions drawn, they do not specify whether the presented percentages are averages of multiple replicates or if they correspond to a single repetition. The number of times the experiment was repeated should be reported. In addition, Figure 7c lacks the quantification of the hsp17.6b-1 mutant phenotype, which is the background of the knock-in lines. This is an essential control for this experiment.

      For the seedling survival rates and gene expression levels, we added statistical analysis based on at least two independent experiments. Figure 6E of the revised manuscript shows the phenotypes of the WT, hsp17.6b-1, HSP17.6B-KI, and HSP17.6B-OE plants and the statistical analysis of their seedling survival rates after heat exposure.

      (2) In Figure 1c, the transcript levels of HsfA2 splice variants are not evident, as the authors only show the quantification of the truncated variant. Moreover, similar to the phenotypes discussed above, it is unclear whether the reported values are averages and, if so, what is the error associated with the measurements. This information could explain the differences observed in the rosette phenotypes of the S-HsfA2-KD lines. Similarly, the gene expression quantification presented in Figures 4 and 5, as well as the GUS protein quantification of Figure 3F, also lacks this crucial information.

      RT‒qPCR analysis of the expression of these genes from at least two independent experiments was performed. We also added this missing information to the figure legends.

      (3) The quality of the main figures is low, which in some cases prevents proper visualization of the data presented. This is particularly critical for the quantification of the phenotypes shown in Figure 1b and for the fluorescence images in Figures 4f and 5b. Also, Figure 9b lacks essential information describing the components of the performed experiments.

      We apologize; owing to the limitations of equipment and technology, we will attempt to obtain high-quality images in the future. A detailed description of Figure 9b is provided in the methods section.

      (4) Mutants with low levels of S-HsfA2 yield smaller plants than the corresponding wild type. This appears contradictory, given that the proposed role of this truncated HSF is to counteract the growth repression induced by the canonical HSF. What would be a plausible explanation for this observation? Was this phenomenon observed with any of the other tested S-HSFs?

      We found that the constitutive expression of S-HsfA2 inhibits Arabidopsis growth. Considering this, Arabidopsis plants do not produce S-HsfA2 under normal conditions to avoid growth inhibition. However, under heat stress, Arabidopsis plants generate S-HsfA2, which contributes to heat tolerance and growth balance. In the revised manuscript, we provided supporting data indicating that S-HsfA4c-GFP or S-HsfB1-RFP constitutive expression confers Arabidopsis extreme heat stress sensitivity but inhibits root growth (Supplemental Figure S8). Therefore, this phenomenon is also observed in S-HsfA4c-GFP or S-HsfB1-RFP.

      (5) In some cases, the authors make statements that are not supported by the results:

      (i) the claim that only the truncated variant expression is changed in the knock-down lines is not supported by Figure 1c;

      In three S-HsfA2-KD lines, RT‒PCR splicing analysis revealed that HsfA2-II but not HsfA2-III is easily detected. In the revised manuscript, we added RT‒qPCR analysis, and the results revealed that the abundance of HsfA2-III and HsfA2-II but not that of the full-length HsfA2 mRNA significantly decreased under extreme heat (Figure 1C). Considering that HsfA2-III but not HsfA2-II is a predominant splice variant under extreme heat (Liu et al., 2013), S-HsfA2-KD may lead to the knockdown of alternative HsfA2 splicing transcripts, especially HsfA2-III.

      (ii) the increase in GUS signal in Figure 3a could also result from local protein production;

      We included this possibility in the results analysis.

      (iii) in Figure 6b, the deletion of the HRE abolishes heat responsiveness, rather than merely altering the level of response; and

      In the revised manuscript, we added new data concerning the roles of HREs and HSEs in the response of the HSP17.6B promoter to heat stress (Figure 6A). These results suggest that the HRE and HSE elements are responsible for the response of the HSP17.6B promoter to heat stress and that the HRE negatively regulates the HSP17.6B promoter at 37°C, whereas the HSE is positive at 42°C.

      (iv) the phenotypes in Figure 8b are not clear enough to conclude that HSP17.6B overexpressors exhibit a dwarf but heat-tolerant phenotype.

      When grown in soil, the HSP17.6B-OE seedlings presented a dwarf phenotype compared with the WT control. Heat stress resulted in browning of the WT leaves, but the leaves of the HSP17.6B-OE plants remained green, suggesting that the HSP17.6B-OE seedlings also presented a heat-tolerant phenotype in the soil. These results are qualitative but not quantitative experimental data; therefore, the conclusions are adjusted in the abstract and results sections.

      Reviewer #2 (Public review):

      Summary:

      The authors report that Arabidopsis short HSFs S-HsfA2, S-HsfA4c, and S-HsfB1 confer extreme heat. They have truncated DNA binding domains that bind to a new heat-regulated element. Considering Short HSFA2, the authors have highlighted the molecular mechanism by which S-HSFs prevent HSR hyperactivation via negative regulation of HSP17.6B. The S-HsfA2 protein binds to the DNA binding domain of HsfA2, thus preventing its binding to HSEs, eventually attenuating HsfA2-activated HSP17.6B promoter activity. This report adds insights to our understanding of heat tolerance and plant growth.

      Strengths:

      (1) The manuscript represents ample experiments to support the claim.

      (2) The manuscript covers a robust number of experiments and provides specific figures and graphs in support of their claim.

      (3) The authors have chosen a topic to focus on stress tolerance in a changing environment.

      Weaknesses:

      (1) One s-HsfA2 represents all the other s-Hsfs; S-HsfA4c, and S-HsfB1. s-Hsfs can be functionally different. Regulation may be positive or negative. Maybe the other s-hsfs may positively regulate for height and be suppressed by the activity of other s-hsfs.

      In this study, we used S-HsfA2, S-HsfA4c, and S-HsfB1 data to support the view that “splice variants of HSFs generate new plant HSFs”. We also noted that S-HsfA2 cannot represent a traditional S-HSF. S-HsfA4c and S-HsfB1 may have functions other than S-HsfA2 because of their different C-terminal motifs or domains. Different S-HSFs might participate in the same biological process, such as heat tolerance, through the coregulation of downstream genes. We added this information to the discussion section.

      (2) Previous reports on gene regulations by hsfs can highlight the mechanism.

      In the introduction section, we included these references concerning HSFs and S-HSFs.

      (3) The Materials and Methods section could be rearranged so that it is based on the correct flow of the procedure performed by the authors.

      The materials and methods and results sections are arranged in logical order.

      (4) Graphical representation could explain the days after sowing data, to provide information regarding plant growth.

      The days after sowing (DAS) for the age of the Arabidopsis seedlings are stated in the Materials and Methods section and figure legends.

      (5) Clear images concerning GFP and RFP data could be used.

      We provided high-quality images of S-HsfA2-GFP and the GFP control (Figure 3 in the revised manuscript).

      Reviewing Editor comments:

      The EMSA shown in Figures 2, 3, 4, and 5, which are critical to support the manuscript's claims, are of poor quality, without any repeats to support. In addition, there is not much information about how these EMSA were done. I suggest including better EMSA in a new version of this manuscript.

      Thank you for your suggestion. We added the missing information, including the detailed EMSA method and experiment repeat times in the methods section and figure legends. We provide high-quality images of HRE probes binding to nuclear proteins (Figure 4E).

      Reviewer  #1 (Recommendations for the authors):

      (1) The paper is overall well-written, but it could greatly benefit from reorganizing the results subsections. Currently, there are entire subsections dedicated to supplementary figures (e.g., lines 177-191) and main figures split into different subsections (e.g., lines 237-246). It is recommended to organize all the information related to a main figure into a single subsection and to incorporate the description of the corresponding supplementary figures. This would imply a general reorganization of the figures, moving some information to the supplementary data (for instance, the data in Figure 4 could be supplementary to Figure 5) and vice versa (Supplementary Figure 4 should be incorporated into main Figure 2, as it presents very important results). Also, Figures 7 and 8 would be better presented if merged into a single figure/subsection.

      Thank you for your suggestion. We have merged some figures into a single figure according to the main information. In the current version, there are 8 main figures, which includes a new figure.

      (2) Survival phenotypes vary widely, making reliable statistical analysis challenging. The chlorophyll and fresh weight quantifications presented in figures such as Figure 5 appear to effectively describe the phenomenon and allow for statistical comparisons. Figures 1 and 7 would benefit from including these measurements if the variability in survival percentages is too high to calculate statistical differences reliably. Also, in Figure 8, all chlorophyll measurements should be normalized to fresh weight rather than seedling number due to the dwarfism observed in the overexpressor lines.

      Thank you for pointing out your concerns. We added statistical analysis based on at least two independent experiments, including Figures 1 and 7, to the original manuscript. In Figure 8 in the original manuscript, chlorophyll measurements were normalized to fresh weight.

      (3) Typos: in Figure 3a it should be "min" not "mim"; in Supplementary Figure 3, the GFP and merge images are swapped.

      We apologize for these errors, and we have corrected them. Supplementary Figure 3 was replaced with new images and was included in Figure 3 in the revised manuscript.

      Reviewer  #2 (Recommendations for the authors):

      (1) The abstract states "How this process is prevented to ensure proper plant growth has not been determined." The authors can be the first to do this, by adding graphical data on the height difference between hSfA2-Arabidopsis and wild-type Arabidopsis.

      Thank you; we agree with you. We have added this information to the new working model (Figure 8).

      (2) The authors claim that Arabidopsis S-HsfA2, S-HsfA4c, and S-HsfB1; but have used S-HsfA2 to understand the action. The mechanisms being unknown for S-HsfA4c, and S-HsfB1 cannot be represented by S-HsfA2 to represent the mechanism.

      Thank you for your valuable comments. In this study, we used S-HsfA2, S-HsfA4c, and S-HsfB1 data to support the view that “splice variants of HSFs generate new plant HSFs”. We also noted that S-HsfA2 cannot represent a traditional S-HSF because S-HsfA4c and S-HsfB1 may have functions different from those of S-HsfA2. Therefore, we deleted “representative S-HSF” from the revised manuscript. In the future, we will conduct in-depth research on the relevant mechanisms of S-HsfA4c and S-HsfB1 under your guidance.

      (3) The authors can include which of the HSFs interacted with other genes of Arabidopsis reported by other researchers are positively or negatively regulated in heat response/ growth or the balance.

      In the introduction section, we included these genes. AtHsfA2, AtHsfA3, and BhHsf1 confer heat tolerance in Arabidopsis but also result in a dwarf phenotype in plants (Ogawa et al., 2007; Yoshida et al., 2008; Zhu et al., 2009).

      (4) The authors have started from the subsection plant materials and growth conditions. It is unclear from where the authors have found these HSF mutant Arabidopsis? Is it a continuation of some other work? As a reader, I am utterly confused because of the arrangement of the materials and methods section.

      We apologize for the lack of detailed information in the Materials and methods section. These mutants were purchased from AraShare (Fuzhou, China) and verified via PCR and RT‒qPCR. We added the missing information.

      (5) Is the DAS - Days After Sowing - represented as a graph or table? This will add data to the plant growth section to clearly state the difference between the mutants and the wild-type.

      In this study, the age of the Arabidopsis seedlings was calculated as days after sowing (DAS), as stated in the Materials and Methods section and figure legends.

      (6) Heat stress treatment after gus staining looks absurd. Should it not follow after plant materials and growth conditions, which should ideally be after the plant transformation and cloning section? The initial step is definitely about plasmid construction. Kindly rearrange.

      Thank you for your valuable suggestions. We have rearranged the logical order of the materials and methods.

      (7) The expression of GFP and RFP was not clearly seen in the images. This could be because of the poor resolution of the images added.

      We obtained high-quality images of S-HsfA2-GFP (Figure 3 in the revised manuscript).

      (8) We live in an age where it is widely known that genes are not functioning independently but are coregulated and coregulate other proteins. The authors can address the role of these spliced variants on gene regulation and compare them with the HSFs.

      We agree with your suggestion. In this study, HSP17.6B was identified as a direct target gene of S-HsfA2 and HsfA2, which can partly explain the role of S-HsfA2 in heat resistance and growth balance. However, the mechanism by which S-HsfA2 regulates heat tolerance and growth balance may not be limited to HSP17.6B. On the basis of the current data, we propose that the putative S-HsfA2-DREB2A-HsfA3 module might be associated with the roles of S-HsfA2 in heat tolerance and growth balance. Please refer to the discussion section for a detailed explanation.

      (9) Regulatory elements can be validated in relation to their interaction with proven HSFs.

      Supplemental Figure S3 shows that His6-HsfA2 failed to bind to the HRE in vitro.

      (10) The authors seem to be biased toward heat stress and have not worked enough on plant growth. Biochemical data and images on plant growth could be added to bring out the novelty of this manuscript.

      Thank you for your suggestion. We added new data indicating that, compared with the wild-type control, S-HsfA2-GFP, S-HsfA4c-GFP, or S-HsfB1-GFP overexpression reduced root length (Supplemental Figure 8).

      (11) Line 251 on page 11 of the submitted manuscript says that the S-HSFs were previously identified by Liu et al. (2013), yet in the abstract the authors claim that these S-HSFs are NEW kinds of HSF with a unique truncated DNA-binding domain (tDBD) that binds a NEW heat-regulated element (HRE).

      In our previous report, several S-HSFs, including S-HsfA2, S-HsfA4, S-HsfB1, and S-HsfB2a, were identified primarily in Arabidopsis (Liu et al., 2013). In this study, we further characterized S-HsfA2, S-HsfA4, and S-HsfB1 and revealed several features of S-HSFs. Therefore, we claim that these S-HSFs are new kinds of HSFs.

      (12) What are these NEW kinds of HRE? Which genes have these HREs? Was an in silico study conducted to examine them, or can any reports be cited?

      HREs, i.e., heat-regulated elements, are newly identified heat-responsive elements in this study. The sequences of HREs are partially related to traditional heat shock elements (HSEs). Because we did not identify the essential nucleic acids required for t-DBD binding to the HRE, we did not perform an in silico study.

      (13) S-HSFs may interact with existing HSFs. Have the authors thought in this direction? It can have a role in positively regulating other sHSFs or regulating multiple expressing genes related to plant growth and other functions. This needs to be explored.

      Thank you for this point. Given that the overexpression of Arabidopsis HsfA2 or HsfA3 inhibits growth under nonstress conditions, we discussed this direction from the perspective of the interaction of S-HsfA2 with HsfA2 or HsfA3 in the revised manuscript.

      (14) The authors need to concentrate on the presentation and arrangement of both their materials and methods and result section and write them in a systematic manner (or following a workflow).

      The materials, methods and results sections are arranged in logical order.

      (15) The authors have used references in the results section which can be added to the discussion section to make it more accurate.

      Thank you for your suggestions. We have moved some references to the discussion section, but the necessary references remain in the results section.

    1. Critique 1: "Onwuegbuzie and Leech (2005) add that studies using a single research method are currently becoming a threat to the advancement of the social sciences."

      It is stated that a study carried out under a single research methodology is considered a threat to the social sciences. However, I believe that if a study is carried out under a single methodology, it is because that methodology is the appropriate one to apply in order to meet the required objectives.

      Critique 2: "Brown (2014) states that making the right decisions about the type of instruments to be used in a study can require a great deal of time and consideration. Moreover, besides deciding which instruments will yield reliable information, it is necessary to find a way of gathering data that is ultimately easy and efficient to collect and analyze."

      Critique 3: "Seidman (1998) recommends that researchers administer a sequence of three interviews with the same participants to obtain sufficient information. The first interview is conducted to break the ice and create an atmosphere of empathy, while quickly surveying the areas that will be investigated later."

      When two or more interviews are conducted with the same participants, the participants may begin to show fatigue from the time spent answering questions; for this reason, in my view, this option may not be very efficient, depending on the type of person.

      Critique 4: "After the respective analysis of the transcripts of the first two interviews, the purpose of the third, according to Seidman (1998), is to cover certain topics that were not addressed in the previous sessions."

      I do not think that conducting a third interview to cover topics that were not addressed is the best option, since those topics could be addressed in the two earlier interviews to avoid overloading the interviewee.

      Critique 5: "Mackey and Gass (2005) recommend that novice or beginner researchers, before conducting a statistical analysis of data, familiarize themselves with the statistics of their study through courses, texts, or consultations with experts in the field."

      Here the author is right that novice researchers who work with statistical data should first study statistics in order to interpret the data better.

    1. Reviewer #4 (Public review):

      Summary:

      Several behavioral experiments and one TMS experiment were performed to examine adaptation to room reverberation for speech intelligibility in noise. This is an important topic that has been extensively studied by several groups over the years. And the study is unique in that it examines one candidate brain area, dlPFC, potentially involved in this learning, and finds that disrupting this area by TMS results in a reduction in the learning. The behavioral conditions are in many ways similar to previous studies. However, they find results that do not match previous results (e.g., performance in anechoic condition is worse than in reverberation), making it difficult to assess the validity of the methods used. One unique aspect of the behavioral experiments is that Ambisonics was used to simulate the spaces, while headphone simulation was mostly used previously. The main behavioral experiment was performed by interleaving 3 different rooms and measuring speech intelligibility as a function of the number of words preceding the target in a given room on a given trial. The findings are that performance improves on the time scale of seconds (as the number of words preceding the target increases), but also on a much larger time scale of tens to hundreds of seconds (corresponding to multiple trials), while for some listeners it is degraded for the first couple of trials. The study also finds that the performance is best in the room that matches the T60 most commonly observed in everyday environments. These are potentially interesting results. However, there are issues with the design of the study and analysis methods that make it difficult to verify the conclusions based on the data.

      Strengths:

      (1) Analysis of the adaptation to reverberation on multiple time scales, for multiple reverberant and anechoic environments, and also considering contextual effects of one environment interleaved with the other two environments.

      (2) TMS experiment showing reduction of some of the learning effects by temporarily disabling the dlPFC.

      Weaknesses:

      While the study examines the adaptation for different carrier lengths, it keeps multiple characteristics (mainly talker voice and location) fixed in addition to reverberation. Therefore, it is possible that the subjects adapt to other aspects of the stimuli, not just to reverberation. A condition in which only reverberation would switch for the target would allow the authors to separate these confounding alternatives. Now, the authors try to address the concerns by indirect evidence/analyses. However, the evidence provided does not appear sufficient.

      The authors use terms that are either not defined or that seem to be defined incorrectly. The main issue then is the results, which are based on analysis of what the authors call d', Hit Rate, and Final Hit Rate. First of all, they switch between these measures without explanation. Second, it's not clear how they define them, given that their responses are either 4-alternative or 8-alternative forced choice. d', Hit Rate, and False Alarm Rate are defined in signal detection theory for the detection of the presence of a target. The framework can be easily extended to a 2-alternative forced choice. But how does one define a Hit, and, in particular, a False Alarm, in a 4- or 8-alternative task? The authors do not state how they did it, and without that, the computation of d' based on HR and FAR is dubious. Also, what the authors call Hit Rate is presumably the percent correct performance (PCC), but even that is not clear. Then they use FHR and act as if this were the asymptotic value of their HR, even though in many conditions the learning has not ended, and arbitrarily define a variable of ±10 from FHR, which must produce different results depending on whether the asymptote was reached or not. Other examples of strange usage of terms: they talk about "global likelihood learning" (L426) without a definition or a reference, and about "cumulative hit rate" (L1738), where it is not clear to me what "cumulative" means.
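
      For illustration only, one textbook way to obtain d' from percent correct in an m-alternative forced-choice task is the equal-variance Gaussian model, in which PC = ∫ φ(t) Φ(t + d')^(m−1) dt. The sketch below assumes that model; it is not known to be the procedure the authors used, and the function names are hypothetical.

```python
import math

def norm_pdf(x):
    # Standard normal density.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pc_mafc(d_prime, m, steps=2000):
    """Predicted proportion correct for an m-AFC task at a given d'.

    Assumes equal-variance Gaussian observations: the target interval is
    shifted by d' and the response is correct when it exceeds all m - 1
    non-target intervals. Integrated with the trapezoid rule on [-8, 8].
    """
    a, b = -8.0, 8.0
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        t = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * norm_pdf(t) * norm_cdf(t + d_prime) ** (m - 1)
    return total * h

def dprime_from_pc(pc, m, tol=1e-6):
    """Invert pc_mafc by bisection; pc must lie between chance and 1."""
    if not (1.0 / m < pc < 1.0):
        raise ValueError("pc must be above chance (1/m) and below 1")
    lo, hi = 0.0, 10.0  # search range for d'; values near 1.0 saturate at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pc_mafc(mid, m) < pc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

      Under this model the same percent correct maps to a larger d' in an 8-alternative task than in a 4-alternative task, which is one reason the definition needs to be stated explicitly.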

      There are not enough acoustic details about the stimuli. The authors find that reverberant performance is overall better than anechoic in 2 rooms. This goes contrary to previous results. And the authors do not provide enough acoustic details to establish that this is not an artefact of how the stimuli were normalized (e.g., what were the total signal and noise levels at the two ears in the anechoic and reverberant conditions?).

      There are some concerns about the use of statistics. For example, the authors perform a two-way ANOVA (L724-728) in which one factor is room, but that factor does not have the same 3 levels across the two levels of the other factor. Also, in some comparisons, they randomly select 11 out of 22 subjects even though appropriate tests correct for such imbalances without adding the extra randomness of whether the 11 selected subjects happened to be the good or the bad ones.

      Details of the experiments are not sufficiently described in the methods (L194-205) to be able to follow what was done. It should be stated that 1 main experiment was performed using 3 rooms, and that 3 follow-ups were done on a new set of subjects, each with the room swapped.

    2. Reviewer #3 (Public review):

      Summary:

      This manuscript presents a well-designed and insightful behavioural study examining human adaptation to room acoustics, building on prior work by Brandewie & Zahorik. The psychophysical results are convincing and add incremental but meaningful knowledge to our understanding of reverberation learning. However, I find the transcranial magnetic stimulation (TMS) component to be over-interpreted. The TMS protocol, while interesting, lacks sufficient anatomical specificity and mechanistic explanation to support the strong claims made regarding a unique role of the dorsolateral prefrontal cortex (dlPFC) in this learning process. More cautious interpretation is warranted, especially given the modest statistical effects, the fact that the main TMS result of interest is a null result, the imprecise targeting of dlPFC (which is not validated), and the lack of knowledge about the timescale of TMS effects in relation to the behavioural task. I recommend revising the manuscript to shift emphasis toward the stronger behavioural findings and to present a more measured and transparent discussion of the TMS results and their limitations.

      Strengths:

      (1) Well-designed acoustical stimuli and psychophysical task.

      (2) Comparisons across room combinations are well conducted.

      (3) The virtual acoustic environment is impressive and applied well here.

      (4) A timely study with interesting behavioural results.

      Weaknesses:

      (1) Lack of hypotheses, particularly for TMS.

      (2) Lack of evidence for targeting TMS in [brain] space and time.

      (3) The most interesting effect of TMS is a null result compared to a weak statistical effect for "meta adaptation".

    3. Reviewer #2 (Public review):

      Summary:

      This study investigated how listeners adapt to and utilize statistical properties of different acoustic spaces to improve speech perception. The researchers used repetitive TMS to perturb neural activity in DLPFC, inhibiting statistical learning compared to sham conditions. The authors also identified the most effective room types for the effective use of reverberations in speech in noise perception, with regular human-built environments bringing greater benefits than modified rooms with lower or higher reverberation times.

      Strengths:

      The introduction and discussion sections of the paper are very interesting and highlight the importance of the current study, particularly with regard to the use of ecologically valid stimuli in investigating statistical learning. However, they could be condensed in parts. TMS parameters and task conditions were well-considered and clearly explained.

      Weaknesses:

      (1) The Results section is difficult to follow and includes a lot of detail, which could be removed. As such, it presents as confusing and speculative at times.

      (2) The hypotheses for the study are not clearly stated.

      (3) Multiple statistical models are implemented without correcting the alpha value. This leaves the analyses vulnerable to Type I errors.

      (4) It is confusing to understand how many discrete experiments are included in the study as a whole, and how many participants are involved in each experiment.

      (5) The TMS study is significantly underpowered and not robust. Sample size calculations need further explanation (effect sizes appear to be based on behavioural studies?). I would recommend presenting these data as exploratory and calculating a posteriori the full sample size required, based on the effect sizes observed in the TMS data.
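
      Regarding weakness (3), one common way to control the family-wise error rate across several tests is the Holm-Bonferroni step-down procedure. The sketch below is a generic illustration with a hypothetical function name and made-up p-values; it does not reproduce any analysis from the manuscript.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Holm's step-down procedure: sort p-values in ascending order and compare
    the k-th smallest (k = 0, 1, ...) against alpha / (m - k). Stop at the
    first p-value that fails; it and all larger p-values are not rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject

# Made-up p-values for four hypothetical models.
example = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
```

      With these made-up values, only the two smallest p-values survive correction, although all four would pass an uncorrected threshold of 0.05.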

    1. Reviewer #2 (Public review):

      Summary:

      The study by Rowley and Sedigh-Sarvestani presents modeling data suggesting that map reversals in mouse lateral extrastriate visual cortex do not coincide with areal borders, but instead represent borders between subregions within a single area V2. The authors propose that such an organization explains the partial coverage in higher-order areas reported by Zhuang et al. (2017). The scheme revisits an organization proposed by Kaas et al. (1989), who interpreted the multiple projection patches traced from V1 in the squirrel lateral extrastriate cortex as subregions within a single area V2. Kaas et al.'s interpretation was challenged by Wang and Burkhalter (2007), who used a combination of topographic mapping of V1 connections and receptive field recordings in mice. Their findings supported a different partitioning scheme in which each projection patch mapped a specific topographic location within single areas, each containing a complete representation of the visual field. The area map of mouse visual cortex by Wang and Burkhalter (2007) has been reproduced by hundreds of studies and has been widely accepted as ground truth (CCF) (Wang et al., 2020) of the layout of rodent cortex. In the meantime, topographic mappings in marmoset and tree shrew visual cortex made a strong case for map reversals in lateral extrastriate cortex, which represent borders between functionally diverse subregions within a single area V2. These findings from non-rodent species raised doubts about whether, during evolution, different mammalian branches have developed diverse partitioning schemes of the cerebral cortex. Rowley and Sedigh-Sarvestani favor a single master plan in which, across evolution, all mammalian species have used a similar blueprint for subdividing the cortex.

      Strengths:

      The story illustrates the enduring strength of science in search of definitive answers.

      Weaknesses:

      To me, it remains an open question whether Rowley and Sedigh-Sarvestani have written the final chapter of the saga. A key reason for my reservation is that the maps used in their model are cherry-picked. The article disregards published complementary maps, which show that the entire visual field is represented in multiple areas (i.e., LM, AL) of lateral extrastriate cortex and that the map reversal between LM and AL coincides precisely with the transition in m2AChR expression and cytoarchitecture (Wang and Burkhalter, 2007; Wang et al., 2011). Evidence from experiments in rats supports the gist of the findings in the mouse visual cortex (Coogan and Burkhalter, 1993).

      (1) The selective use of published evidence, such as the complete visual field representation in higher visual areas of lateral extrastriate cortex (Wang and Burkhalter, 2007; Wang et al., 2011) makes the report more of an opinion piece than an original research article that systematically analyzes the area map of mouse visual cortex we have proposed. No direct evidence is presented for a single area V2 with functionally distinct subregions.

      (2) The article misrepresents evidence by commenting that m2AChR expression is mainly associated with the lower field. This is counter to published findings showing that m2AChR spans the entire visual field (Gamanut et al., 2018; Meier et al., 2021). The utility of markers for delineating areal boundaries is discounted, without any evidence, in disregard of evidence for distinct areal patterns in early development (Wang et al., 2011). The point that markers can be distributed non-uniformly within an area is well known. m2AChR is non-uniformly expressed in mouse V1, LM and LI (Ji et al., 2015; D'Souza et al., 2019; Meier et al., 2021). Recently, it has been found that the patchy organization within V1 plays a role in the organization of thalamocortical and intracortical networks (Meier et al., 2025). m2AChR-positive patches and m2AChR-negative interpatches organize the functionally distinct ventral and dorsal networks, notably without obvious bias for upper and lower parts of the visual field.

      (3) The study has adopted an area partitioning scheme, which is said to be based on anatomically defined boundaries of V2 (Zhuang et al., 2017). The only anatomical borders used by Zhuang et al. (2017) are those of V1 and barrel cortex, identified by cytochrome oxidase staining. In reality, the partitioning of the visual cortex was based on field sign maps, which are reproduced from Zhuang et al., (2017) in Figure 1A. It is unclear why the maps shown in Figures 2E and 2F differ from those in Figure 1A. It is possible that this is an oversight. But maintaining consistent areal boundaries across experimental conditions that are referenced to the underlying brain structure is critical for assigning modeled projections to areas or sub-regions. This problem is evident in Figure 2F, which is presented as evidence that the modeling approach recapitulates the tracings shown in Figure 3 of Wang and Burkhalter (2007). The dissimilarities between the modeling and tracing results are striking, unlike what is stated in the legend of Figure 2F.

      (4) Rowley and Sedigh-Sarvestani find that the partial coverage of the visual field in higher order areas shown by Zhuang et al. (2017) is recreated by the model. It is important to caution that Zhuang et al.'s (2017) maps were derived from incomplete mappings of the visual field, which were confined to -25 to 35 deg of elevation. This underestimates the coverage we have found in LM and AL. Receptive field mappings show that LM covers 0-90 deg of azimuth and -30 to 80 deg of elevation (Wang and Burkhalter, 2007). AL covers at least 0-90 deg of azimuth and -30 to 50 deg of elevation (Wang and Burkhalter, 2007; Wang et al., 2011). These are important differences. Partial coverage in LM and AL underestimates the size of these areas and may map two projection patches as inputs to subregions of a single area rather than inputs to two separate areas. Complete, or nearly complete, visual representations in LM and AL support that each is a single area. Importantly, both areas are included in a callosal-free zone (Wang and Burkhalter, 2007). The surrounding callosal connections align with the vertical meridian representation. The single map reversal is marked by a transition in m2AChR expression and cytoarchitecture (Wang et al., 2011).

      (5) The statement that the "lack of visual field overlap across areas is suggestive of a lack of hierarchical processing" is predicated on the full acceptance of the mappings by Zhuang et al (2017). Based on the evidence reviewed above, the reclassification of visual areas proposed in Figure 1C seems premature.

      (6) The existence of lateral connections is not unique to rodent cortex and has been described in primates (Felleman and Van Essen, 1991).

      (7) Why the mouse and rat extrastriate visual cortex differ from those of many other mammals is unclear. One reason may be that mammals with V2 subregions are strongly binocular.

    2. Author response:

      eLife Assessment:

      This paper performs a valuable critical reassessment of anatomical and functional data, proposing a reclassification of the mouse visual cortex in which almost all the higher visual areas are consolidated into a single area V2. However, the evidence supporting this unification is incomplete, as the key experimental observations that the model attempts to reproduce do not accurately reflect the literature. This study will likely be of interest to neuroscientists focused on the mouse visual cortex and the evolution of cortical organization.

      We disagree, and it is unclear to us which 'key experimental observations' that the model attempts to reproduce fail to reflect the literature accurately. The model reproduces a complete map of the visual field, with overlap in certain regions. When reversals are used to delineate areas, as is the current custom, multiple higher order areas are generated, and each area has a biased and overlapping visual field coverage. These are the simple outputs of the model, and they are consistent with the published literature, including recent publications such as Garrett et al. 2014 and Zhuang et al. 2017, a paper published in this journal. The area boundaries produced by the model are not identical to area boundaries in the literature, because the model is a simplification.

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors argue that defining higher visual areas (HVAs) based on reversals of retinotopic tuning has led to an over-parcellation of secondary visual cortices. Using retinotopic models, they propose that the HVAs are more parsimoniously mapped as a single area V2, which encircles V1 and exhibits complex retinotopy. They reanalyze functional data to argue that functional differences between HVAs can be explained by retinotopic coverage. Finally, they compare the classification of mouse visual cortex to that of other species to argue that our current classification is inconsistent with those used in other model species.

      Strengths:

      This manuscript is bold and thought-provoking, and is a must-read for mouse visual neuroscientists. The authors take a strong stance on combining all HVAs, with the possible exception of area POR, into a single V2 region. Although I suspect many in the field will find that their proposal goes too far, many will agree that we need to closely examine the assumptions of previous classifications to derive a more accurate areal map. The authors' supporting analyses are clear and bolster their argument. Finally, they make a compelling argument for why the classification is not just semantic, but has ramifications for the design of experiments and analysis of data.

      Weaknesses:

      Although I enjoyed the polemic nature of the manuscript, there are a few issues that weaken their argument.

      (1) Although the authors make a compelling argument that retinotopic reversals are insufficient to define distinct regions, they are less clear about what would constitute convincing evidence for distinct visual regions. They mention that a distinct area V3 has been (correctly) defined in ferrets based on "cytoarchitecture, anatomy, and functional properties", but elsewhere argue that none of these factors are sufficient to parcellate any of the HVAs in mouse cortex, despite some striking differences between HVAs in each of these factors. It would be helpful to clearly define a set of criteria that could be used for classifying distinct regions.

      We agree the revised manuscript would benefit from a clear discussion of updated rules of area delineation in the mouse. In brief, we argue that retinotopy alone should not be used to delineate area boundaries in mice, or any other species. Although there is some evidence for functional property, architecture, and connectivity changes across mouse HVAs, area boundaries continue to be defined primarily, and sometimes solely (Garrett et al., 2014; Juavinett et al., 2018; Zhuang et al., 2017), based on retinotopy. We acknowledge that earlier work (Wang and Burkhalter, 2007; Wang et al., 2011) did consider cytoarchitecture and connectivity alongside retinotopy, but more recent work has shifted to a focus on retinotopy as indicated by the currently accepted criterion for area delineation.  

      As reviewer #2 points out, the present criteria for mouse visual area delineation can be found in the Methods section of: [Garrett, M.E., Nauhaus, I., Marshel, J.H., and Callaway, E.M. (2014)].

      Criterion 1: Each area must contain the same visual field sign at all locations within the area.

      Criterion 2: Each visual area cannot have a redundant representation of visual space.

      Criterion 3: Adjacent areas of the same visual field sign must have a redundant representation.

      Criterion 4: An area's location must be consistently identifiable across experiments.
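
      Criterion 1 relies on the visual field sign. As commonly defined, the field sign at a cortical location is the sign of the sine of the angle between the gradients of the azimuth and elevation maps, which equals the sign of the z-component of their cross product. The sketch below illustrates that definition on toy grids; the function name, the pure-Python finite differences, and the axis convention are assumptions for illustration and are not taken from the cited Methods.

```python
def field_sign(azimuth, elevation):
    """Visual field sign (+1, -1, or 0) at each interior pixel.

    azimuth and elevation are 2-D lists (rows x columns) of preferred
    visual field position in degrees. Gradients use central differences,
    so border pixels are left at 0. The sign convention depends on how the
    cortical axes are oriented; only reversals between neighbouring
    regions are meaningful here.
    """
    rows, cols = len(azimuth), len(azimuth[0])
    sign = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Gradient of each map along columns (x) and rows (y).
            az_x = (azimuth[r][c + 1] - azimuth[r][c - 1]) / 2.0
            az_y = (azimuth[r + 1][c] - azimuth[r - 1][c]) / 2.0
            el_x = (elevation[r][c + 1] - elevation[r][c - 1]) / 2.0
            el_y = (elevation[r + 1][c] - elevation[r - 1][c]) / 2.0
            # z-component of grad(azimuth) x grad(elevation).
            cross = az_x * el_y - az_y * el_x
            sign[r][c] = (cross > 0) - (cross < 0)
    return sign

# Toy maps: azimuth increases along columns, elevation along rows.
n = 5
az = [[float(c) for c in range(n)] for _ in range(n)]
el = [[float(r) for _ in range(n)] for r in range(n)]
mirrored_az = [[-float(c) for c in range(n)] for _ in range(n)]
```

      In this toy case the mirrored azimuth map yields the opposite field sign, which is the kind of reversal that Criterion 1 treats as an area border.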

      As discussed in the manuscript, recent evidence in higher order visual cortex of tree shrews and rats led us to question the universality of these criteria across species. Specifically, tree shrew V2, macaque V2, and marmoset DM exhibit reversals in visual field sign within what are defined as single visual areas. This suggests that Criterion 1 should be updated. It also suggests that Criteria 2 and 3 should be updated, because visual field sign reversals often co-occur with retinotopic redundancies: reversing course in the direction of progression along the visual field can easily lead to coverage of visual field regions already traveled.

      More broadly, we argue that topography is just one of several criteria that should be considered in area delineation. We understand that few visual areas in any species meet all criteria, but we emphasize that topography cannot consistently be the sole satisfied criterion – as it currently appears to be for many mouse HVAs. Inspired by a recent perspective on cortical area delineation (Petersen et al., 2024), we suggest the following rules, which will be worked into the revised version of the manuscript. Topography is a criterion, but it comes after considerations of function, architectonics and connectivity.

      (1) Function—Cortical areas differ from neighboring areas in their functional properties  

      (2) Architectonics—Cortical areas often exhibit distinctions from neighboring areas in multiple cyto- and myeloarchitectonic markers

      (3) Connectivity—Cortical areas are characterized by a specific set of connectional inputs and outputs from and to other areas

      (4) Topography—Cortical areas often exhibit a distinct topography that balances maximal coverage of the sensory field with minimal redundancy of coverage within an area.

      As we discuss in the manuscript, although there are functional, architectonic, and connectivity differences across mouse HVAs, they typically vary smoothly across multiple areas – such that neighboring areas share the same properties and there are no sharp borders. For instance, sharp borders in cytoarchitecture are generally lacking in the mouse HVAs. A notable exception to this is the clear and sharp change in m2AChR expression that occurs between LM and AL (Wang et al., 2011).

      (2) On a related note, although the authors carry out impressive analyses to show that differences in functional properties between HVAs could be explained by retinotopy, they glossed over some contrary evidence that there are functional differences independent of retinotopy. For example, axon projections to different HVAs originating from a single V1 injection - presumably including neurons with similar retinotopy - exhibit distinct functional properties (Glickfeld LL et al, Nat Neuro, 2013). As another example, interdigitated M2+/M2- patches in V1 show very different HVA connectivity and response properties, again independent of V1 location/retinotopy (Meier AM et al., bioRxiv). One consideration is that the secondary regions might be considered a single V2 with distinct functional modules based on retinotopy and connectivity (e.g., V2LM, V2PM, etc).

      Thank you for the correction. We will revise the text to discuss (Glickfeld et al., 2013), as it remains some of the strongest evidence in favor of retinotopy-independent functional specialization of mouse HVAs. However, one caveat of this study is the size of the V1 injection that is the source of axons studied in the HVAs. As apparent in Figure 1B, the large injection covers nearly a quarter of V1. It is worth noting that (Han et al., 2018) found, using single-cell reconstructions and MAPseq, that the majority of V1 neurons project to multiple nearby HVA targets. In this experiment the tracing does not suffer from the problem of spreading over V1’s retinotopic map, and it suggests that locations in each area, presumably retinotopically matched, receive shared inputs from the V1 population rather than a distinct but spatially interspersed subset. In fact, the authors conclude “Interestingly, the location of the cell body within V1 was predictive of projection target for some recipient areas (Extended Data Fig. 8). Given the retinotopic organization of V1, this suggests that visual information from different parts of visual field may be preferentially distributed to specific target areas, which is consistent with recent findings (Zhuang et al., 2017)”. Given an injection covering a large portion of the retinotopic map, and the fact that feed-forward projections from V1 to HVAs carry coarse retinotopy, it is difficult to prove that functional specializations noted in the HVA axons are retinotopy-independent. This would require measurement of receptive field location in the axonal boutons, which the authors did not perform (possibly because the SNR of calcium indicators prevented such measurements at the time).

      Another option would be to show that adjacent neurons in V1 that project to far-apart HVAs exhibit distinct functional properties on par with differences exhibited by neurons in very different parts of V1 due to retinotopy. In other words, the functional specificity of V1 inputs to HVAs at retinotopically identical locations is of the same order as those that might be gained by retinotopic biases. To our knowledge, such a study has not been conducted, so we have decided to collect the data in collaboration with the Allen Institute. As part of the Allen Institute’s pioneering OpenScope project, we will make careful two-photon and electrophysiology measurements of functional properties, including receptive field location, SF, and TF in different parts of the V1 retinotopic map. Pairing this data with existing Allen Institute datasets on functional properties of neurons in the HVAs will allow us to rule in, or rule out, our hypotheses regarding retinotopy as the source of functional specialization in mouse HVAs. We will update the discussion in the revised manuscript to better reflect the need for additional evidence to support or refute our proposal.

      Meier AM et al., bioRxiv 2025 (Meier et al., 2025) was published after our submission, but we are thankful to the reviewers for guiding our attention to this timely paper. Given the recent findings on the influence of locomotion on rodent and primate visual cortex, it is very exciting to see clearly specialized circuits for processing self-generated visual motion in V1. However, it is difficult to rule out the role of retinotopy, because the HVAs participating in the M2+ network, which is less responsive to self-generated visual motion (LM, AL, RL), exhibit a bias for the medial portion of the visual field, whereas the HVA involved in the M2- network, which is responsive to self-generated visual motion (PM), exhibits a bias for the lateral (or peripheral) parts of the visual field. For instance, a peripheral bias in area PM has been shown using retrograde tracing as in Figure 6 of (Morimoto et al., 2021), single-cell anterograde tracing as in Extended Data Figure 8 of (Han et al., 2018), and functional imaging studies (Zhuang et al., 2017). Recent findings in the marmoset also point to visual circuits in the peripheral, but not central, visual field being significantly modulated by self-generated movements (Rowley et al., 2024).

      However, a visual field bias in area PM, which selectively receives M2- inputs, is at odds with the clear presence of modular M2+/M2- patches across the entire map of V1 (Ji et al., 2015). One possibility supported by existing data is that neurons in M2- patches, as well as those in M2+ patches, in the central representation of V1 make fewer or significantly weaker connections with area PM compared to areas LM, AL and RL. Evidence to the contrary would support retinotopy-independent and functionally specialized inputs from V1 to HVAs.

      (3) Some of the HVAs (such as AL, AM, and LI) appear to have redundant retinotopic coverage with other HVAs, such as LM and PM. Moreover, these regions have typically been found to have higher "hierarchy scores" based on connectivity (Harris JA et al., Nature, 2019; D'Souza RD et al., Nat Comm, 2022), though unfortunately, the hierarchy levels are not completely consistent between studies. Based on existing evidence, there is a reasonable argument to be made for a hybrid classification, in which some regions (e.g., LM, P, PM, and RL) are combined into a single V2 (though see point #2 above) while other HVAs are maintained as independent visual regions, distinct from V2. I don't expect the authors to revise their viewpoint in any way, but a more nuanced discussion of alternative classifications is warranted.

      We understand that such a proposal, which would combine a subset of areas with matched field sign (LM, P, PM, and RL), would be less extreme and better received by the community. This would create a V2 with a smooth map without reversals or significant redundant retinotopic coverage. However, the intuition we have built from our modeling studies suggests that both these areas, and the other smaller areas with negative field sign (AL, AM, LI), are a byproduct of a complex single map of the visual field that exhibits reversals as it contorts around the triangular and tear-shaped boundaries of V1. In other words, we believe the redundant coverage and field-sign changes/reversals are a byproduct of a single secondary visual field in V2 constrained by the cortical dimensions of V1. That being said, we understand that area delineations are in part based on a consensus by the community. Therefore, we will continue to discuss our proposal with community members, and we will incorporate new evidence supporting or refuting our hypothesis, before we submit our revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      The study by Rowley and Sedigh-Sarvestani presents modeling data suggesting that map reversals in mouse lateral extrastriate visual cortex do not coincide with areal borders, but instead represent borders between subregions within a single area V2. The authors propose that such an organization explains the partial coverage in higher-order areas reported by Zhuang et al., (2017). The scheme revisits an organization proposed by Kaas et al., (1989), who interpreted the multiple projection patches traced from V1 in the squirrel lateral extrastriate cortex as subregions within a single area V2. Kaas et al's interpretation was challenged by Wang and Burkhalter (2007), who used a combination of topographic mapping of V1 connections and receptive field recordings in mice. Their findings supported a different partitioning scheme in which each projection patch mapped a specific topographic location within single areas, each containing a complete representation of the visual field. The area map of mouse visual cortex by Wang and Burkhalter (2007) has been reproduced by hundreds of studies and has been widely accepted as ground truth (CCF) (Wang et al., 2020) of the layout of rodent cortex. In the meantime, topographic mappings in marmoset and tree shrew visual cortex made a strong case for map reversals in lateral extrastriate cortex, which represent borders between functionally diverse subregions within a single area V2. These findings from non-rodent species raised doubts about whether during evolution, different mammalian branches have developed diverse partitioning schemes of the cerebral cortex. Rowley and Sedigh-Sarvestani favor a single master plan in which, across evolution, all mammalian species have used a similar blueprint for subdividing the cortex.

      Strengths:

      The story illustrates the enduring strength of science in search of definitive answers.

      Weaknesses:

      To me, it remains an open question whether Rowley and Sedigh-Sarvestani have written the final chapter of the saga. A key reason for my reservation is that the maps used in their model are cherry-picked. The article disregards published complementary maps, which show that the entire visual field is represented in multiple areas (i.e. LM, AL) of lateral extrastriate cortex and that the map reversal between LM and AL coincides precisely with the transition in m2AChR expression and cytoarchitecture (Wang and Burkhalter, 2007; Wang et al., 2011). Evidence from experiments in rats supports the gist of the findings in the mouse visual cortex (Coogan and Burkhalter, 1993).

      We would not claim to have written the final chapter of the saga. Our goal was to add an important piece of new evidence to the discussion of area delineations across species. We believe this new evidence supports our unification hypothesis.  We also believe that there are several missing pieces of data that could support or refute our hypothesis. We have begun a collaboration to collect some of this data.  

      (1) The selective use of published evidence, such as the complete visual field representation in higher visual areas of lateral extrastriate cortex (Wang and Burkhalter, 2007; Wang et al., 2011) makes the report more of an opinion piece than an original research article that systematically analyzes the area map of mouse visual cortex we have proposed. No direct evidence is presented for a single area V2 with functionally distinct subregions.

      This brings up a nuanced issue regarding visual field coverage. Wang & Burkhalter, 2007 Figure 6 shows the receptive field of sample neurons in area LM that cover the full range between 0 and 90 degrees of azimuth, and -40 to 80 degrees of elevation – which essentially matches the visual field coverage in V1. However, we do not know whether these neurons are representative of most neurons in area LM. In other words, while these single-cell recordings along selected contours in cortex show the span of the visual field coverage, they may not be able to capture crucial information about its shape, missing regions of the visual field or potential bias. To mitigate this, visual field maps measured with electrophysiology are commonly produced by even sampling across the two dimensions of the visual area, either by moving a single electrode along a grid pattern (e.g. (Manger et al., 2002)), or using a grid-like multi-electrode probe (e.g. (Yu et al., 2020)). This was not carried out either in Wang & Burkhalter 2007 or Wang et al. 2011. Even sampling of cortical space is time consuming and difficult with electrophysiology, but efficient with functional imaging. Therefore, despite the likely under-estimation of visual field coverage, imaging techniques are valuable in that they can efficiently exhibit not only the span of the visual field of a cortical region, but also its shape and bias.

      Multiple functional imaging studies that simultaneously measure visual field coverage in V1 and HVAs report a bias in the coverage of HVAs, relative to that in V1 (Garrett et al., 2014; Juavinett et al., 2018; Zhuang et al., 2017). While functional imaging will likely underestimate receptive fields compared to electrophysiology, the consistent observation of an orderly bias for distinct parts of the visual field across the HVAs suggests that at least some of the HVAs do not have full and uniform coverage of the visual field comparable to that in V1. For instance, (Garrett et al., 2014) show that the total coverage in HVAs, when compared to V1, is typically less than half (Figure 6D) and often irregularly shaped.

      Careful measurements of single-cell receptive fields, using mesoscopic two-photon imaging across the HVAs, would settle this question. As reviewer #1 points out, this is technically feasible, though no dataset of this kind exists to our knowledge.

      (2) The article misrepresents evidence by commenting that m2AChR expression is mainly associated with the lower field. This is counter to published findings showing that m2AChR spans across the entire visual field (Gamanut et al., 2018; Meier et al., 2021). The utility of markers for delineating areal boundaries is discounted, without any evidence, in disregard of evidence for distinct areal patterns in early development (Wang et al., 2011). Pointing out that markers can be distributed non-uniformly within an area is well-familiar. m2AChR is non-uniformly expressed in mouse V1, LM and LI (Ji et al., 2015; D'Souza et al., 2019; Meier et al., 2021). Recently, it has been found that the patchy organization within V1 plays a role in the organization of thalamocortical and intracortical networks (Meier et al., 2025). m2AChR-positive patches and m2AChR-negative interpatches organize the functionally distinct ventral and dorsal networks, notably without obvious bias for upper and lower parts of the visual field.

      We wrote that “Future work showed boundaries in labeling of histological markers such as SMI-32 and m2ChR labeling, but such changes mostly delineated area LM/AL (Wang et al., 2011) and seemed to be correlated with the representation of the lower visual field.” The latter statement regarding the representation of the lower visual field is directly referencing the data in Figure 1 of (Wang et al., 2011), which is titled “Figure 1: LM/AL border identified by the transition of m2AChR expression coincides with receptive field recordings from lower visual field.” Similar to Wang et al., we were simply referring to the fact that the border of area LM/AL co-exhibits a change in m2AChR expression as well as lower-visual field representation.

      (3) The study has adopted an area partitioning scheme, which is said to be based on anatomically defined boundaries of V2 (Zhuang et al., 2017). The only anatomical borders used by Zhuang et al. (2017) are those of V1 and barrel cortex, identified by cytochrome oxidase staining. In reality, the partitioning of the visual cortex was based on field sign maps, which are reproduced from Zhuang et al., (2017) in Figure 1A. It is unclear why the maps shown in Figures 2E and 2F differ from those in Figure 1A. It is possible that this is an oversight. But maintaining consistent areal boundaries across experimental conditions that are referenced to the underlying brain structure is critical for assigning modeled projections to areas or sub-regions. This problem is evident in Figure 2F, which is presented as evidence that the modeling approach recapitulates the tracings shown in Figure 3 of Wang and Burkhalter (2007). The dissimilarities between the modeling and tracing results are striking, unlike what is stated in the legend of Figure 2F.

      Thanks for this correction. By “anatomical boundaries of higher visual cortex”, we meant the cortical boundary between V1 and higher order visual areas on one end, and the outer edge of the envelope that defines the functional boundaries of the HVAs in cortical space (Zhuang et al., 2017). The reviewer is correct that we should have referred to these as functional boundaries. The word ‘anatomical’ was meant to refer to cortical space, rather than visual field space.

      More generally though, there is no disagreement between the partitioning of visual cortex in Figures 1 and 2. Rather, the partitioning in Figure 1 is directly taken from Zhuang et al., (2017) whereas that in Figure 2 is produced by mathematical model simulation. As such, one would not expect identical areal boundaries between Figure 2 and Figure 1. What we aimed to communicate with our modeling results is that a single area can exhibit multiple visual field reversals and retinotopic redundancies if it is constrained to fit around V1 and cover a visual field approximately matched to the visual field coverage in V1. We defined this area explicitly as a single area with a single visual field (boundaries shown in Figure 2A). So the point of our simulation is to show that even an explicitly defined single area can appear as multiple areas if it is constrained by the shape of mouse V1, and if visual field reversals are used to indicate areal boundaries. As in most models, different initial conditions and parameters produce a complex visual field which will appear as multiple HVAs when delineated by areal boundaries. What is consistent, however, is the existence of a complex single visual field that appears as multiple HVAs with partially overlapping coverage.

      Similarly, we would not expect a simple model to exactly reproduce the multi-color tracer injections in Wang and Burkhalter (2007). However, we find it quite compelling that the model can produce multiple groups of multi-colored axonal projections beyond V1 that can appear as multiple areas each with their own map of the visual field using current criteria, when the model is explicitly designed to map a single visual field. We will explain the results of the model, and their implications, better in the revised manuscript.

      (4) Rowley and Sedigh-Sarvestani find that the partial coverage of the visual field in higher order areas shown by Zhuang et al (2017) is recreated by the model. It is important to caution that Zhuang et al's (2017) maps were derived from incomplete mappings of the visual field, which was confined to -25-35 deg of elevation. This underestimates the coverage we have found in LM and AL. Receptive field mappings show that LM covers 0-90 deg of azimuth and -30-80 elevation (Wang and Burkhalter, 2007). AL covers at least 0-90 deg of azimuth and -30-50 deg of elevation (Wang and Burkhalter, 2007; Wang et al., 2011). These are important differences. Partial coverage in LM and AL underestimates the size of these areas and may map two projection patches as inputs to subregions of a single area rather than inputs to two separate areas. Complete, or nearly complete, visual representations in LM and AL support that each is a single area. Importantly, both areas are included in a callosal-free zone (Wang and Burkhalter, 2007). The surrounding callosal connections align with the vertical meridian representation. The single map reversal is marked by a transition in m2AChR expression and cytoarchitecture (Wang et al., 2011).

      This is a good point. We do not expect that expanding the coverage of V1 will change the results of the model significantly. However, for the revised manuscript, we will update V1 coverage to be accurate, repeat our simulations, and report the results.  

      (5) The statement that the "lack of visual field overlap across areas is suggestive of a lack of hierarchical processing" is predicated on the full acceptance of the mappings by Zhuang et al (2017). Based on the evidence reviewed above, the reclassification of visual areas proposed in Figure 1C seems premature.

      The reviewer is correct. In the revised manuscript, we will be careful to distinguish bias in visual field coverage across areas from presence or lack of visual field overlap.  

      (6) The existence of lateral connections is not unique to rodent cortex and has been described in primates (Felleman and Van Essen, 1991).

      (7) Why the mouse and rat extrastriate visual cortex differ from those of many other mammals is unclear. One reason may be that mammals with V2 subregions are strongly binocular.

      This is an interesting suggestion, and careful visual topography data from rabbits and other lateral-eyed animals would help to evaluate it. For what it’s worth, tree shrews are lateral-eyed animals with only 50 degrees of binocular visual field and also show V2 subregions.

      Reviewer #3 (Public review):

      Summary:

      The authors review published literature and propose that a visual cortical region in the mouse that is widely considered to contain multiple visual areas should be considered a single visual area.

      Strengths:

      The authors point out that relatively new data showing reversals of visual-field sign within known, single visual areas of some species require that a visual field sign change by itself should not be considered evidence for a border between visual areas.

      Weaknesses:

      The existing data are not consistent with the authors' proposal to consolidate multiple mouse areas into a single "V2". This is because the existing definition of a single area is that it cannot have redundant representations of the visual field. The authors ignore this requirement, as well as the data and definitions found in published manuscripts, and make an inaccurate claim that "higher order visual areas in the mouse do not have overlapping representations of the visual field". For quantification of the extent of overlap of representations between 11 mouse visual areas, see Figure 6G of Garrett et al. 2014. [Garrett, M.E., Nauhaus, I., Marshel, J.H., and Callaway, E.M. (2014). Topography and areal organization of mouse visual cortex. The Journal of neuroscience 34, 12587-12600. 10.1523/JNEUROSCI.1124-14.2014.]

      Thank you for this correction. We admit we should have chosen our words more carefully. In the revised manuscript, we will emphasize that higher order visual areas in the mouse do have some overlap in their representations but also exhibit bias in their coverage. This is consistent with our proposal, and in fact our model simulations in Figure 2E also show overlapping representations along with differential bias in coverage. However, we also note that Figure 6 of Garrett et al. 2014 provides several pieces of evidence in support of our proposal that higher order areas are sub-regions of a single area V2. First, the visual field coverage of each area is significantly less than that in V1 (Garrett et al. 2014, Figure 6D). While the imaging methods used in Garrett et al. likely under-estimate receptive fields, one would assume they would similarly impact measurements of coverage in V1 and HVAs. Second, each area exhibits a bias towards a different part of the visual field (Figure 6C and E); this bias is distinct for different areas but proceeds in a retinotopic manner around V1, with adjacent areas exhibiting biases for nearby regions of the visual field (Figure 6E). Thus, the biases in the visual field coverage across HVAs appear to be related and not independent of each other. As we show in our modeling and in Figure 2, such orderly and inter-related biases can be created from a single visual field constrained to share a border with mouse V1.

      With regards to the existing definition of a single area: we did not ignore the requirement that single areas cannot have redundant representations of the visual field. Rather, we believe that this requirement should be relaxed considering new evidence collected from other species, where multiple visual field reversals exist within the same visual area. We understand this issue is nuanced and was not made clear in the original submission.  

      In the revised manuscript, we will clarify that visual field reversals often exhibit redundant retinotopic representation on either side of the reversal. We will also clarify that our argument that multiple reversals can exist within a single visual area in the mouse is an argument that some retinotopic redundancy can exist within single visual areas. Such a re-classification would align how we define visual areas in mice with existing classification in tree shrews, ferrets, cats, and primates – all of which have secondary visual areas with complex retinotopic maps exhibiting multiple reversals and redundant retinotopic coverage.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their careful assessment and enthusiastic appreciation of our work.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this article, Thomas et al. use a super-resolution approach in living cells to track proteins involved in the fusion event of sexual reproduction. They study the spatial organization and dynamics of the actin fusion focus, a key structure in cell-cell fusion in Schizosaccharomyces pombe. The researchers have adapted a high-precision centroid mapping method using three-color live-cell epifluorescence imaging to map the dynamic architecture of the fusion focus during yeast mating. The approach relies on tracking the centroid of fluorescence signals for proteins of interest, spatially referenced to Myo52-mScarlet-I (as a robust marker) and temporally referenced using a weakly fluorescent cytosolic protein (mRaspberry), which redistributes strongly upon fusion. The trajectories of five key proteins, including markers of polarity, cytoskeleton, exocytosis and membrane fusion, were compared to Myo52 over a 75-minute window spanning fusion. Their observations indicate that secretory vesicles maintain a constant distance from the plasma membrane whereas the actin network compacts. Most importantly, they discovered a positive feedback mechanism in which myosin V (Myo52) transports Fus1 formin along pre-existing actin filaments, thereby enhancing aster compaction.

      This article is well written, the arguments are convincing and the assertions are balanced. The centroid tracking method has been clearly and solidly controlled. Overall, this is a solid addition to our understanding of cytoskeletal organization in cell fusion.

      Major comments: No major comment.

      Minor comments:

      - On page 8, the authors wrote "Upon depletion of Myo52, Ypt3 did not accumulate at the fusion focus (Figure 3C). A thin, wide localization at the fusion site was occasionally observed (Figure 3C, Movies S3)": Is there a quantification of this accumulation in the mutant?

      We will provide the requested quantification. The localization is very faint, so we are not sure that quantification will capture this faithfully, but we will try.

      - The frame rate of movies could be improved for reader comfort: For example, movie S6 lasts 0.5 sec.

      We agree that the frame rates of movies S3 and S6 could be improved. We will provide them with a slower frame rate.

      Reviewer #1 (Significance (Required)):

      This study represents a conceptual and technical breakthrough in our understanding of cytoskeletal organization during cell-cell fusion. The authors introduce a high-precision, three-color live-cell centroid mapping method capable of resolving the spatio-temporal dynamics of protein complexes at the nanometer scale in living yeast cells. This methodological innovation enables systematic and quantitative mapping of the dynamic architecture of proteins at the cell fusion site, making it a powerful live-cell imaging approach. However, it is important to keep in mind that the increased precision achieved through averaging comes at the expense of overlooking atypical or outlier behaviors. The authors discovered a myosin V-dependent mechanism for the recruitment of formin that leads to actin aster compaction. The identification of Myo52 (myosin V) as a transporter of Fus1 (formin) to the fusion focus adds a new layer to our understanding of how polarized actin structures are generated and maintained during developmentally regulated processes such as mating.

      Previous studies have shown the importance of formins and myosins during fusion, but this paper provides a quantitative and dynamic mapping that demonstrates how Myo52 modulates Fus1 positioning in living cells. This provides a better understanding of actin organization, beyond what has been demonstrated by fixed-cell imaging or genetic perturbation.

      Audience: Cell biologists working on actin dynamics, cell-cell fusion and intracellular transport. Scientists involved in live-cell imaging, single particle tracking and cytoskeleton modeling.

      I have expertise in live-cell microscopy, image analysis, fungal growth machinery and actin organization.

      We thank the reviewer for their appreciation of our work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      A three-color imaging approach using centroid tracking is employed to determine the high-resolution position over time of tagged actin fusion focus proteins during mating in fission yeast. In particular, the positions of different protein components (tagged in a 3rd color) were determined in relation to the position (and axis) of the molecular motor Myo52, which is tagged with two different colors in the mating cells. Furthermore, time is normalized by the rapid diffusion of a weak fluorescent protein probe (mRaspberry) from one cell to the other upon fusion pore opening. From this approach multiple important mechanistic insights were determined for the compaction of fusion focus proteins during mating, including the general compaction of different components as fusion proceeds with different proteins having specific stereotypical behaviors that indicate underlying molecular insights. For example, secretory vesicles remain a constant distance from the plasma membrane, whereas the formin Fus1 rapidly accumulates at the fusion focus in a Myo52-dependent manner.

      I have minor suggestions/points: (1) Figure 1, for clarity it would be helpful if the cells shown in B were in the same orientation as the cartoon cells shown in A. Similarly, it would be helpful to have the orientation shown in D the same as the data that is subsequently presented in the rest of the manuscript (such as Figure 2) where time is on the X axis and distance (position) is on the Y axis.

      We have turned each image in panel B by 180° to match the cartoon in A. For panel D, we are not sure what the reviewer would like. This panel shows the coordinates of each Myo52 position, whereas Figure 2 shows oriented distance (on the Y axis) over time (on the X axis). Perhaps the reviewer suggests that we should display panel D with a rotation onto the Y axis rather than the X axis. We feel that this would not bring more clarity and prefer to keep it as is.

      (2) Figure 2, for clarity useful to introduce how the position of Myo52 changes over time with respect to the fusion site (plasma membrane) earlier, and then come back to the positions of different proteins with respect to Myo52 shown in 2E. Currently the authors discuss this point after introducing Figure 2E, but better for the reader to have this in mind beforehand.

      We have added a sentence at the start of the section describing Figure 2, pointing out that the static appearance of Myo52 is due to it being used as reference, but that in reality, it moves relative to the plasma membrane: “Because Myo52 is the reference, its trace is flat, even though in reality Myo52 also moves relative to other proteins and the plasma membrane (see Figure 2E)”. This change is already in the text.

      (3) First sentence of page 8 "..., peaked at fusion time and sharply dropped post-fusion (Figure S3)." Figure S3 should be cited so that the reader knows where this data is presented.

      Thanks, we have added the missing figure reference to the text.

      (4) Figure 3D-H, why is Exo70 used as a marker for vesicles instead of Ypt3 for these experiments? Exo70 seems to have a more confusing localization than Ypt3 (3C vs 3D), which seems to complicate interpretations.

      There are two main reasons for this choice. First, the GFP-Ypt3 fluorescence intensity is lower than that of Exo70-GFP, which makes analysis more difficult and less reliable. Second, in contrast to Exo70-GFP, where the endogenous gene is tagged at the native genomic locus, GFP-Ypt3 is expressed as an additional copy alongside endogenous untagged Ypt3. Although GFP-Ypt3 was reported to be fully functional as it can complement the lethality of a ypt3 temperature-sensitive mutant (Cheng et al, MBoC 2002), its expression levels are non-native and we do not have a strain in which ypt3 is tagged at the 5’ end at the native genomic locus. For these reasons, we preferred to examine in detail the localization of Exo70. We do not think it complicates interpretations. Exo70 faithfully decorates vesicles and exhibits the same localization as Ypt3 in WT cells (see Figure 2D) and in myo52-AID (see Figure 3C-D). We realize that our text was a bit confusing as we contrasted the localization of Exo70 and Ypt3, when all we wanted to state was that the Exo70-GFP signal is stronger. We have corrected this in the text.

      (5) Page 10, end of first paragraph, "We conclude...and promotes separation of Myo52 from the vesicles." This is an interesting hypothesis/interpretation that is consistent with the spatial-temporal organization of vesicles and the compacting fusion focus, but the underlying molecular mechanism has not been concluded.

      This is an interpretation that is in line with our data. A firm conclusion that the organization of the actin fusion focus imposes a steric barrier to bulk vesicle entry will require in vitro reconstitution of an actin aster driven by formin-myosin V feedback and addition of myosin V vesicle-like cargo, which can be a target for future studies. To make clear that it is an interpretation and not a definitive statement, we have added “likely” to the sentence, as in: “We conclude that the distal position of vesicles in WT cells is a likely steric consequence of the architecture of the fusion focus, which restricts space at the center of the actin aster and promotes separation of Myo52 from the vesicles”.

      (6) Figure 5F and 5G, the results are confusing and should be discussed further. Depletion of Myo52 decreases Fus1 long-range movements, indicating that Fus1 is being transported by Myo52 (5F). Similarly, the Fus1 actin assembly mutant greatly decreases Fus1 long-range movements and prevents Myo52 binding (5G), perhaps indicating that Fus1-mediated actin assembly is important. It seems the authors' interpretations are oversimplified.

      We show that Myo52 is critical for Fus1 long-range movements, as stated by the reviewer. We also show that Fus1-mediated actin assembly is important. The question is in what way.

      One possibility is that FH2-mediated actin assembly powers the movement, which in this case represents the displacement of the formin due to actin monomer addition on the polymerizing filament. A second possibility is that actin filaments assembled by Fus1 somehow help Myo52 move Fus1. This could be for instance because Fus1-assembled actin filaments are preferred tracks for Myo52-mediated movements, or because they allow Myo52 to accumulate in the vicinity of Fus1, enhancing their chance encounter and thus the number of long-range movements (on any actin track). Based on the analysis of the K1112A point mutant in Fus1 FH2 domain, our data cannot discriminate between these three different options, which is why we concluded that the mutant allele does not allow us to make a firm conclusion. However, the Myo52-dependence clearly shows that a large fraction of the movements requires the myosin V. We have clarified the end of the paragraph in the following way: “Therefore, analysis of the K1112A mutant phenotype does not allow us to clearly distinguish between Fus1-powered from Myo52-powered movements. Future work will be required to test whether, in addition to myosin V-dependent transport, Fus1-mediated actin polymerization also directly contributes to Fus1 long-range movements.”

      (7) Figure 6, why not measure the fluorescence intensity of Fus1 as a proxy for the number of Fus1 molecules (rather than the width of the Fus1 signal), which seems to be the more straight-forward analysis?

      The aim of the measurement was to test whether Myo52 and Fus1 activity help focalize the formin at the fusion site, not whether these are required for localization in this region. This is why we are measuring the lateral spread of the signal (its width) rather than the fluorescence intensity of the signal. We know from previous work that Fus1 localizes to the shmoo tip independently of myosin V (Dudin et al, JCB 2015), and we also show this in Figure 6. However, the precise distribution of Fus1 is wider in absence of the myosins.

      We can and will measure intensities to test whether there is also a quantitative difference in the number of molecules at the shmoo tip.
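
      The distinction drawn in this response, between how much signal is present and how widely it is spread, can be illustrated with a minimal sketch. It assumes a one-dimensional, background-subtracted intensity profile sampled along the cell-cell contact area, and it uses the intensity-weighted standard deviation as the width measure. Both the profiles and the choice of width metric are hypothetical and are not taken from the manuscript, whose actual width measurement is not described here.

```python
import math


def total_intensity(profile):
    """Summed intensity of a background-subtracted profile (proxy for amount)."""
    return sum(profile)


def lateral_width(profile, pixel_size=1.0):
    """Intensity-weighted standard deviation of position (proxy for spread).

    pixel_size is a hypothetical calibration factor; the result is in the
    same units as pixel_size.
    """
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must have positive total intensity")
    mean = sum(i * v for i, v in enumerate(profile)) / total
    variance = sum(v * (i - mean) ** 2 for i, v in enumerate(profile)) / total
    return math.sqrt(variance) * pixel_size


# Two made-up profiles with identical summed intensity but different spread.
focused = [0, 0, 1, 8, 1, 0, 0]
spread = [0, 2, 2, 2, 2, 2, 0]

print(total_intensity(focused), total_intensity(spread))
print(lateral_width(focused), lateral_width(spread))
```

      In this toy case the two profiles have the same summed intensity while the second is clearly wider, which is why an intensity measurement alone would not reveal a focusing defect.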

      (8) Figure 7, the authors should note (and perhaps discuss) any evidence as to whether activation of Fus1 to facilitate actin assembly depends upon Fus1 dissociating from Myo52 or whether Fus1 can be activated while still associated with Myo52, as both circumstances are included in the figure.

      This is an interesting point. We have no experimental evidence for or against Fus1 dissociating from Myo52 to assemble actin. However, it is known that formins rotate along the actin filament double helix as they assemble it, a movement that seems poorly compatible with processive transport by myosin V. In Figure 7, we do not particularly want to imply that Myo52 associates with Fus1 linked or not with an actin filament. The figure serves to illustrate the focusing mechanism of myosin V transporting a formin, which is more evident when we draw the formin attached to a filament end. We have now added a sentence in the figure legend to clarify this point: “Note that it is unknown whether Myo52 transports Fus1 associated or not with an actin filament.”

      (9) Figure 7, the color of secretory vesicles should be the same in A and B.

      This is now corrected.

      Reviewer #2 (Significance (Required)):

      This is an impactful and high quality manuscript that describes an elegant experimental strategy with important insights determined. The experimental imaging strategy (and analysis), as well as the insight into the pombe mating fusion focus and its comparison to other cytoskeletal compaction events will be of broad scientific interest.

      We thank the reviewer for their appreciation of our work.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Fission yeast cell-cell fusion during mating is mediated by an actin-based structure called the 'fusion focus', which orchestrates actin polymerization by the mating-specific formin, Fus1, to direct polarized secretion towards the mating site. In the current study, Thomas and colleagues quantitatively map the spatial distribution of proteins mediating cell-cell fusion using a three-color fluorescence imaging methodology in the fission yeast Schizosaccharomyces pombe. Using Myo52 (Type V myosin) as a fluorescence reference point, the authors discover that proteins known to localize to the fusion focus have distinct spatial distributions and accumulation profiles at the mating site. Myo52 and Fus1 form a complex in vivo detected by co-immunoprecipitation and each contribute to directing secretory vesicles to the fusion focus. Previous work from this group has shown that the intrinsically disordered region (IDR) of Fus1 plays a critical role in forming the fusion focus. Here, the authors swap out the IDR of fission yeast Fus1 for the IDR of an unrelated mammalian protein, coincidentally called 'fused in sarcoma' (FUS). They express the Fus1∆IDR-FUSLC-27R chimera in mitotically dividing fission yeast cells, where Fus1 is not normally expressed, and discover that the Fus1∆IDR-FUSLC-27R chimera can travel with Myo52 on actively polymerizing actin cables. Additionally, they show that acute loss of Myo52 or Fus1 function, using Auxin-Inducible Degradation (AID) tags and point mutations, impair the normal compaction of the fusion focus, suggesting that direct interaction and coordination of Fus1 and Myo52 helps shape this structure.

      Major Comments:

      (1) In the Results section for Figure 2, the authors claim that actin filaments become shorter and more cross-linked as they move away from the fusion site during mating, and suggest that this may be due to the presence of Myo51. However, the evidence to support this claim is not made clear. Is it supported by high-resolution electron microscopy of the actin filaments, or some other results? This needs to be clarified.

      Sorry if our text was unclear. The basis for the claim that actin filaments become shorter comes from our observation that the average position of tropomyosin and Myo51, both of which decorate actin filaments, is progressively closer to both Fus1 and the plasma membrane. Thus, the actin structure protrudes less into the cytosol as fusion progresses. The basis for claiming that Myo51 promotes actin filament crosslinking comes mainly from previously published papers, which had shown that 1) Myo51 forms complexes with the Rng8 and Rng9 proteins (Wang et al, JCB 2014), and 2) the Myo51-Rng8/9 not only binds actin through Myo51 head domain but also binds tropomyosin-decorated actin through the Rng8/9 moiety (Tang et al, JCB 2016; reference 27 in our manuscript). We had also previously shown that these proteins are necessary for compaction of the fusion focus (Dudin et al, PLoS Genetics 2017; reference 28 in our manuscript). Except for measuring the width of Fus1 distribution in myo51∆ mutants, which confirms previous findings, we did not re-investigate here the function of Myo51.

      We have now re-written this paragraph to present the previous data more clearly: “The distal localization of Myo51 was mirrored by that of tropomyosin Cdc8, which decorates linear actin filaments (Figure 2B) (Hatano et al, 2022). The distal position of the bulk of Myo51-decorated actin filaments was confirmed using Airyscan super-resolution microscopy (Figure 2B, right). Thus, the average position of actin filaments and decreasing distance to Myo52 indicates they initially extend a few hundred nanometers into the cytosol and become progressively shorter as fusion proceeds. Previous work had shown that Myo51 cross-links and slides Cdc8-decorated actin filaments relative to each other (Tang et al, 2016) and that both proteins contribute to compaction of the fusion focus in the lateral dimension along the cell-cell contact area (perpendicular to the fusion axis) (Dudin et al, 2017). We confirmed this function by measuring the lateral distribution of Fus1 along the cell-cell contact area (perpendicular to the fusion axis), which was indeed wider in myo51∆ than WT cells (see below Figure 6A-B).”

      (2) In Figure 4, the authors comment that disrupting Fus1 results in more disperse Myo52 spatial distribution at the fusion focus, raising the possibility that Myo52 normally becomes focused by moving on the actin filaments assembled by Fus1. This can be tested by asking whether latrunculin treatment phenocopies the 'more dispersed' Myo52 localization seen in fus1∆ cells? If Myo52 is focused instead by its direct interaction with Fus1, the latrunculin treatment should not cause the same phenotype.

      This is in principle a good idea, though it is technically challenging because pharmacological treatment of cell pairs in fusion is difficult to do without disturbing pheromone gradients which are critical throughout the fusion process (see Dudin et al, Genes and Dev 2016). We will try the experiment but are unsure about the likelihood of technical success.

      We note however that a similar experiment was done previously on Fus1 overexpressed in mitotic cells (Billault-Chaumartin et al, Curr Biol 2022; Fig 1D). Here, Fus1 also forms a focus and latrunculin A treatment leads to Myo52 dispersion while keeping the Fus1 focus, which is in line with our proposal that Myo52 becomes focused by moving on Fus1-assembled actin filaments. Similarly, we showed in Figure 5B that Latrunculin A treatment of mitotic cells expressing Fus1∆IDR-FUSLC-27R also results in Myo52, but not Fus1 dispersion.

      (3) The Fus1∆IDR-FUSLC-27R chimera used in Figure 5 is an interesting construct to examine actin-based transport of formins in cells. I was curious if the authors could provide the rates of movement for Myo52 and for Fus1∆IDR-FUSLC-27R, both before and after acute depletion of Myo52. It would be interesting to see if loss of Myo52 alters the rate of movement, or instead the movement stems from formin-mediated actin polymerization.

      We will measure these rates.

      (4) Also, Myo52 is known to interact with the mitotic formin For3. Does For3 colocalize with Myo52 and Fus1∆IDR-FUSLC-27R along actin cables?

      This is an interesting question for which we do not have an answer. For technical reasons, we do not have the tools to co-image For3 with Fus1∆IDR-FUSLC-27R because both are tagged with GFP. We feel that this question goes beyond the scope of this paper.

      (5) If Fus1∆IDR-FUSLC-27R is active, does having ectopic formin activity in mitotic cells affect actin cable architecture? This could be assessed by comparing phalloidin staining for wildtype and Fus1∆IDR-FUSLC-27R cells.

      We are not sure what the purpose of this experiment is, or how informative it would be. If it is to evaluate whether Fus1∆IDR-FUSLC-27R is active, our current data already demonstrate this. Indeed, Fus1∆IDR-FUSLC-27R recruits Myo52 in an F-actin and FH2 domain-dependent manner (shown in Figure 5B and 5G), which demonstrates that the Fus1∆IDR-FUSLC-27R FH2 domain is active. Even though Fus1∆IDR-FUSLC-27R assembles actin, we predict that its effect on general actin organization will be weak. Indeed, it is expressed under the endogenous fus1 promoter, leading to very low expression levels during mitotic growth, such that only a subset of cells exhibit a Fus1 focus. Furthermore, most of these Fus1 foci are at or close to cell poles, where linear actin cables are assembled by For3, such that they may not have a strong disturbing effect. Because analysis of actin cable organization by phalloidin staining is difficult (due to the more strongly staining actin patches), cells with a clear change in organization are predicted to be rare in the population, and the gain in knowledge would not be transformative, we are not keen to do this experiment.

      Minor Comments:

      Prior studies are referenced appropriately. Text and figures are clear and accurate. My only suggestion would be Figure 1E-H could be moved to the supplemental material, due to their extremely technical nature. I believe this would help the broad audience focus on the experimental design mapped out in Figure 1A-D.

      We are relatively neutral about this. If this suggestion is supported by the Editor, we can move these panels to supplement.

      Reviewer #3 (Significance (Required)):

      Significance: This study provides an improved imaging method for detecting the spatial distributions of proteins below 100 nm, providing new insights about how a relatively small cellular structure is organized. The use of three-color cell imaging to accurately measure accumulation rates of molecular components of the fusion focus provides new insight into the development of this structure and its roles in mating. This method could be applied to other multi-protein structures found in different cell types. This work uses rigorous genetic tools such as knockout, knockdown and point mutants to dissect the roles of the formin Fus1 and Type V myosin Myo52 in creating a proper fusion focus. The study could be improved by biochemical assays to test whether Myo52 and Fus1 directly interact, since the interaction is only shown by co-immunoprecipitation from extracts, which may reflect an indirect interaction.

      Indeed, future studies should dissect the Fus1-Myo52 interaction, to determine whether it is direct and identify mutants that impair it.

      I believe this work advances the cell-mating field by providing others with a spatial and temporal map of conserved factors arriving to the mating site. Additionally, they identified a way to study a mating specific protein in mitotically dividing cells, offering future questions to address.

      This study should appeal to a range of basic scientists interested in cell biology, the cytoskeleton, and model organisms. The three-colored quantitative imaging could be applied to defining the architecture of many other cellular structures in different systems. Myosin and actin scientists will be interested in how this work expands the interplay of these two fields.

      I am a cell biologist with expertise in live cell imaging, genetics and biochemistry.

      We thank the reviewer for their appreciation of our work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Fission yeast cell-cell fusion during mating is mediated by an actin-based structure called the 'fusion focus', which orchestrates actin polymerization by the mating-specific formin, Fus1, to direct polarized secretion towards the mating site. In the current study, Thomas and colleagues quantitatively map the spatial distribution of proteins mediating cell-cell fusion using a three-color fluorescence imaging methodology in the fission yeast Schizosaccharomyces pombe. Using Myo52 (Type V myosin) as a fluorescence reference point, the authors discover that proteins known to localize to the fusion focus have distinct spatial distributions and accumulation profiles at the mating site. Myo52 and Fus1 form a complex in vivo detected by co-immunoprecipitation and each contribute to directing secretory vesicles to the fusion focus. Previous work from this group has shown that the intrinsically disordered region (IDR) of Fus1 plays a critical role in forming the fusion focus. Here, the authors swap out the IDR of fission yeast Fus1 for the IDR of an unrelated mammalian protein, coincidentally called 'fused in sarcoma' (FUS). They express the Fus1∆IDR-FUSLC-27R chimera in mitotically dividing fission yeast cells, where Fus1 is not normally expressed, and discover that the Fus1∆IDR-FUSLC-27R chimera can travel with Myo52 on actively polymerizing actin cables. Additionally, they show that acute loss of Myo52 or Fus1 function, using Auxin-Inducible Degradation (AID) tags and point mutations, impair the normal compaction of the fusion focus, suggesting that direct interaction and coordination of Fus1 and Myo52 helps shape this structure.

      Major Comments:

      • In the Results section for Figure 2, the authors claim that actin filaments become shorter and more cross-linked as they move away from the fusion site during mating, and suggest that this may be due to the presence of Myo51. However, the evidence to support this claim is not made clear. Is it supported by high-resolution electron microscopy of the actin filaments, or some other results? This needs to be clarified.

      • In Figure 4, the authors comment that disrupting Fus1 results in more disperse Myo52 spatial distribution at the fusion focus, raising the possibility that Myo52 normally becomes focused by moving on the actin filaments assembled by Fus1. This can be tested by asking whether latrunculin treatment phenocopies the 'more dispersed' Myo52 localization seen in fus1∆ cells? If Myo52 is focused instead by its direct interaction with Fus1, the latrunculin treatment should not cause the same phenotype.

      • The Fus1∆IDR-FUSLC-27R chimera used in Figure 5 is an interesting construct to examine actin-based transport of formins in cells. I was curious if the authors could provide the rates of movement for Myo52 and for Fus1∆IDR-FUSLC-27R, both before and after acute depletion of Myo52. It would be interesting to see if loss of Myo52 alters the rate of movement, or instead the movement stems from formin-mediated actin polymerization.

      • Also, Myo52 is known to interact with the mitotic formin For3. Does For3 colocalize with Myo52 and Fus1∆IDR-FUSLC-27R along actin cables?

      • If Fus1∆IDR-FUSLC-27R is active, does having ectopic formin activity in mitotic cells affect actin cable architecture? This could be assessed by comparing phalloidin staining for wildtype and Fus1∆IDR-FUSLC-27R cells.

      Minor Comments:

      • Prior studies are referenced appropriately.

      • Text and figures are clear and accurate. My only suggestion would be Figure 1E-H could be moved to the supplemental material, due to their extremely technical nature. I believe this would help the broad audience focus on the experimental design mapped out in Figure 1A-D.

      Significance

      Significance: This study provides an improved imaging method for detecting the spatial distributions of proteins below 100 nm, providing new insights about how a relatively small cellular structure is organized. The use of three-color cell imaging to accurately measure accumulation rates of molecular components of the fusion focus provides new insight into the development of this structure and its roles in mating. This method could be applied to other multi-protein structures found in different cell types. This work uses rigorous genetic tools such as knockout, knockdown and point mutants to dissect the roles of the formin Fus1 and Type V myosin Myo52 in creating a proper fusion focus. The study could be improved by biochemical assays to test whether Myo52 and Fus1 directly interact, since the interaction is only shown by co-immunoprecipitation from extracts, which may reflect an indirect interaction.

      I believe this work advances the cell-mating field by providing others with a spatial and temporal map of conserved factors arriving to the mating site. Additionally, they identified a way to study a mating specific protein in mitotically dividing cells, offering future questions to address.

      This study should appeal to a range of basic scientists interested in cell biology, the cytoskeleton, and model organisms. The three-colored quantitative imaging could be applied to defining the architecture of many other cellular structures in different systems. Myosin and actin scientists will be interested in how this work expands the interplay of these two fields.

      I am a cell biologist with expertise in live cell imaging, genetics and biochemistry.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      A three-color imaging approach to use centroid tracking is employed to determine the high resolution position over time of tagged actin fusion focus proteins during mating in fission yeast. In particular, the position of different protein components (tagged in a 3rd color) were determined in relation to the position (and axis) of the molecular motor Myo52, which is tagged with two different colors in the mating cells. Furthermore, time is normalized by the rapid diffusion of a weak fluorescent protein probe (mRaspberry) from one cell to the other upon fusion pore opening. From this approach multiple important mechanistic insights were determined for the compaction of fusion focus proteins during mating, including the general compaction of different components as fusion proceeds with different proteins having specific stereotypical behaviors that indicate underlying molecular insights. For example, secretory vesicles remain a constant distance from the plasma membrane, whereas the formin Fus1 rapidly accumulates at the fusion focus in a Myo52-dependent manner.

      I have minor suggestions/points:

      (1) Figure 1, for clarity it would be helpful if the cells shown in B were in the same orientation as the cartoon cells shown in A. Similarly, it would be helpful to have the orientation shown in D the same as the data that is subsequently presented in the rest of the manuscript (such as Figure 2) where time is on the X axis and distance (position) is on the Y axis.

      (2) Figure 2, for clarity useful to introduce how the position of Myo52 changes over time with respect to the fusion site (plasma membrane) earlier, and then come back to the positions of different proteins with respect to Myo52 shown in 2E. Currently the authors discuss this point after introducing Figure 2E, but better for the reader to have this in mind beforehand.

      (3) First sentence of page 8 "..., peaked at fusion time and sharply dropped post-fusion (Figure S3)." Figure S3 should be cited so that the reader knows where this data is presented.

      (4) Figure 3D-H, why is Exo70 used as a marker for vesicles instead of Ypt3 for these experiments? Exo70 seems to have a more confusing localization than Ypt3 (3C vs 3D), which seems to complicate interpretations.

      (5) Page 10, end of first paragraph, "We conclude...and promotes separation of Myo52 from the vesicles." This is an interesting hypothesis/interpretation that is consistent with the spatial-temporal organization of vesicles and the compacting fusion focus, but the underlying molecular mechanism has not been concluded.

      (6) Figure 5F and 5G, the results are confusing and should be discussed further. Depletion of Myo52 decreases Fus1 long-range movements, indicating that Fus1 is being transported by Myo52 (5F). Similarly, the Fus1 actin assembly mutant greatly decreases Fus1 long-range movements and prevents Myo52 binding (5G), perhaps indicating that Fus1-mediated actin assembly is important. It seems the authors' interpretations are oversimplified.

      (7) Figure 6, why not measure the fluorescence intensity of Fus1 as a proxy for the number of Fus1 molecules (rather than the width of the Fus1 signal), which seems to be the more straight-forward analysis?

      (8) Figure 7, the authors should note (and perhaps discuss) any evidence as to whether activation of Fus1 to facilitate actin assembly depends upon Fus1 dissociating from Myo52 or whether Fus1 can be activated while still associated with Myo52, as both circumstances are included in the figure.

      (9) Figure 7, the color of secretory vesicles should be the same in A and B.

      Significance

      This is an impactful and high quality manuscript that describes an elegant experimental strategy with important insights determined. The experimental imaging strategy (and analysis), as well as the insight into the pombe mating fusion focus and its comparison to other cytoskeletal compaction events will be of broad scientific interest.

    1. Reviewer #2 (Public review):

      Summary:

      Building on previous models of multisensory integration (including their earlier correlation-detection framework used for non-spatial signals), the author introduces a population-level Multisensory Correlation Detector (MCD) that processes raw auditory and visual data. Crucially, it does not rely on abstracted parameters, as is common in normative Bayesian models, but rather works directly on the stimulus itself (i.e., individual pixels and audio samples). By systematically testing the model against a range of experiments spanning human, monkey, and rat data, the author shows that the MCD population approach robustly predicts perception and behavior across species with a relatively small (0-4) number of free parameters.

      Strengths:

      (1) Unlike prior Bayesian models that used simplified or parameterized inputs, the model here is explicitly computable from full natural stimuli. This resolves a key gap in understanding how the brain might extract "time offsets" or "disparities" from continuously changing audio-visual streams.

      (2) The same population MCD architecture captures a remarkable range of multisensory phenomena, from classical illusions (McGurk, ventriloquism) and synchrony judgments, to attentional/gaze behavior driven by audio-visual salience. This generality strongly supports the idea that a single low-level computation (correlation detection) can underlie many distinct multisensory effects.

      (3) By tuning model parameters to different temporal rhythms (e.g., faster in rodents, slower in humans), the MCD explains cross-species perceptual data without reconfiguring the underlying architecture.

      (4) The authors frame their model as a plausible algorithmic account of the Bayesian multisensory-integration models in Marr's levels of hierarchy.

      Weaknesses:

      What remains unclear is how the parameters themselves relate to stimulus quantities (like stimulus uncertainty), as is often straightforward in Bayesian models. A theoretical missing link is the explicit relationship between the parameters of the MCD models and those of a cue combination model, thereby bridging Marr's levels of hierarchy.

      Likely Impact and Usefulness

      The work offers a compelling unification of multiple multisensory tasks (temporal order judgments, illusions, Bayesian causal inference, and overt visual attention) under a single, fully stimulus-driven framework. Its success with natural stimuli should interest computational neuroscientists, systems neuroscientists, and machine learning scientists. This paper thus makes an important contribution to the field by moving beyond minimalistic lab stimuli, illustrating how raw audio and video can be integrated using elementary correlation analyses.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Parise presents another instantiation of the Multisensory Correlation Detector model that can now accept stimulus-level inputs. This is a valuable development as it removes researcher involvement in the characterization/labeling of features and allows analysis of complex stimuli with a high degree of nuance that was previously unconsidered (i.e., spatial/spectral distributions across time). The author demonstrates the power of the model by fitting data from dozens of previous experiments, including multiple species, tasks, behavioral modalities, and pharmacological interventions.

      Thanks for the kind words!

      Strengths:

      One of the model's biggest strengths, in my opinion, is its ability to extract complex spatiotemporal co-relationships from multisensory stimuli. These relationships have typically been manually computed or assigned based on stimulus condition and often distilled to a single dimension or even a single number (e.g., "-50 ms asynchrony"). Thus, many models of multisensory integration depend heavily on human preprocessing of stimuli, and these models miss out on complex dynamics of stimuli; the lead modality distribution apparent in Figures 3b and c is provocative. I can imagine the model revealing interesting characteristics of the facial distribution of correlation during continuous audiovisual speech that have up to this point been largely described as "present" and almost solely focused on the lip area.

      Another aspect that makes the MCD stand out among other models is the biological inspiration and generalizability across domains. The model was developed to describe a separate process - motion perception - and in a much simpler organism - Drosophila. It could then describe a very basic neural computation that has been conserved across phylogeny (which is further demonstrated in the ability to predict rat, primate, and human data) and brain area. This aspect makes the model likely able to account for much more than what has already been demonstrated with only a few tweaks akin to the modifications described in this and previous articles from Parise.

      What allows this potential is that, as Parise and colleagues have demonstrated in those papers since our (re)introduction of the model in 2016, the MCD model is modular - both in its ability to interface with different inputs/outputs and its ability to chain MCD units in a way that can analyze spatial, spectral, or any other arbitrary dimension of a stimulus. This fact leaves wide open the possibilities for types of data, stimuli, and tasks a simplistic, neurally inspired model can account for.

      And so it's unsurprising (but impressive!) that Parise has demonstrated the model's ability here to account for such a wide range of empirical data from numerous tasks (synchrony/temporal order judgement, localization, detection, etc.) and behavior types (manual/saccade responses, gaze, etc.) using only the stimulus and a few free parameters. This ability is another of the model's main strengths that I think deserves some emphasis: it represents a kind of validation of those experiments, especially in the context of cross-experiment predictions (but see some criticism of that below).

      Finally, what is perhaps most impressive to me is that the MCD (and the accompanying decision model) does all this with very few (sometimes zero) free parameters. This highlights the utility of the model and the plausibility of its underlying architecture, but also helps to prevent extreme overfitting if fit correctly (but see a related concern below).

      We sincerely thank the reviewer for their thoughtful and generous comments. We are especially pleased that the core strengths of the model—its stimulus-computable architecture, biological grounding, modularity, and cross-domain applicability—were clearly recognized. As the reviewer rightly notes, removing researcher-defined abstractions and working directly from naturalistic stimuli opens the door to uncovering previously overlooked dynamics in complex multisensory signals, such as the spatial and temporal richness of audiovisual speech.

      We also appreciate the recognition of the model’s origins in a simple organism and its generalization across species and behaviors. This phylogenetic continuity reinforces our view that the MCD captures a fundamental computation with wide-ranging implications. Finally, we are grateful for the reviewer’s emphasis on the model’s predictive power across tasks and datasets with few or no free parameters—a property we see as key to both its parsimony and explanatory utility.

      We have highlighted these points more explicitly in the revised manuscript, and we thank the reviewer for their generous and insightful endorsement of the work.

      Weaknesses:

      There is an insufficient level of detail in the methods about model fitting. As a result, it's unclear what data the models were fitted and validated on. Were models fit individually or on average group data? Each condition separately? Is the model predictive of unseen data? Was the model cross-validated? Relatedly, the manuscript mentions a randomization test, but the shuffled data produces model responses that are still highly correlated to behavior despite shuffling. Could it be that any stimulus that varies in AV onset asynchrony can produce a psychometric curve that matches any other task with asynchrony judgements baked into the task? Does this mean all SJ or TOJ tasks produce correlated psychometric curves? Or more generally, is Pearson's correlation insensitive to subtle changes here, considering psychometric curves are typically sigmoidal? Curves can be non-overlapping and still highly correlated if one is, for example, scaled differently. Would an error term such as mean-squared or root mean-squared error be more sensitive to subtle changes in psychometric curves? Alternatively, perhaps if the models aren't cross-validated, the high correlation values are due to overfitting?

      The reviewer is right: the current version of the manuscript only provides limited information about parameter fitting. In the revised version of the manuscript, we included a parameter estimation and generalizability section that includes all information requested by the reviewer.

      To test whether using the MSE instead of Pearson correlation led to a similar estimated set of parameter values, we repeated the fitting using the MSE. The parameters estimated with this method (TauV, TauA, TauBim) closely followed those estimated using Pearson correlation (TauV, TauA, TauBim). Given the similarity of these results, we have chosen not to include further figures; however, this analysis is now included in the new section (pages 23-24).
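
The reviewer's concern that Pearson correlation can hide differences between psychometric curves can be illustrated with a toy case. The minimal Python sketch below is not the authors' fitting code: the cumulative-Gaussian curve, the lag grid, and the names `cumulative_gaussian`, `pearson_r`, and `rmse` are assumptions made purely for illustration. It compares a curve with a vertically compressed copy of itself, for which the correlation is essentially 1 while the RMSE is clearly non-zero.

```python
import math


def cumulative_gaussian(x, mu, sigma):
    """Cumulative Gaussian, a common (assumed) shape for a psychometric curve."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def pearson_r(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rmse(a, b):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical audiovisual lags (ms) and a hypothetical response curve.
lags_ms = list(range(-300, 301, 50))
observed = [cumulative_gaussian(lag, 0.0, 80.0) for lag in lags_ms]

# Same shape, compressed towards 0.5 (as lapses at the tails would do).
compressed = [0.25 + 0.5 * p for p in observed]

r = pearson_r(observed, compressed)   # insensitive to the linear rescaling
err = rmse(observed, compressed)      # sensitive to the rescaling

print(f"Pearson r = {r:.3f}, RMSE = {err:.3f}")
```

Because Pearson correlation is invariant to linear rescaling, agreement between correlation-based and error-based fits (as reported in the response) is the informative check.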

      Regarding the permutation test, it is expected that different stimuli produce analogous psychometric functions: after all, all studies relied on stimuli containing identical manipulation of lags. As a result, MCD population responses tend to be similar across experiments. Therefore, it is not a surprise that the permuted distribution of MCD-data correlation in Supplementary Figure 1K has a mean as high as 0.97. However, what is important is to demonstrate that the non-permuted dataset has an even higher goodness of fit. Supplementary Figure 1K demonstrates that none of the permuted stimuli could outperform the non-permuted dataset; the mean of the non-permuted distribution is 4.7 standard deviations above the mean of the already high permuted distribution.
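
The logic of expressing the matched fit in standard deviations of a permuted distribution can be sketched as follows. This is a generic illustration and not the procedure behind Supplementary Figure 1K: the way model responses are re-paired with datasets, the score function `neg_mse`, the function `permutation_z`, and the toy curves are all assumptions.

```python
import random
import statistics


def neg_mse(model, data):
    """Goodness of fit where higher is better (0 is a perfect match)."""
    return -sum((m - d) ** 2 for m, d in zip(model, data)) / len(model)


def permutation_z(model_curves, data_curves, score=neg_mse,
                  n_permutations=500, seed=0):
    """Compare the matched model-data score with scores from shuffled pairings.

    Returns (matched score, mean permuted score, distance in permuted SDs).
    """
    rng = random.Random(seed)
    matched = statistics.mean(
        score(m, d) for m, d in zip(model_curves, data_curves)
    )
    order = list(range(len(data_curves)))
    null_scores = []
    for _ in range(n_permutations):
        rng.shuffle(order)  # re-pair each model curve with a random dataset
        null_scores.append(statistics.mean(
            score(model_curves[i], data_curves[j]) for i, j in enumerate(order)
        ))
    null_mean = statistics.mean(null_scores)
    null_sd = statistics.stdev(null_scores)
    return matched, null_mean, (matched - null_mean) / null_sd


# Toy example: four hypothetical response curves that the "model" matches exactly.
model_curves = [[0.1, 0.5, 0.9], [0.2, 0.5, 0.8], [0.0, 0.3, 1.0], [0.4, 0.5, 0.6]]
data_curves = [list(curve) for curve in model_curves]

matched, null_mean, z = permutation_z(model_curves, data_curves)
print(f"matched = {matched:.3f}, permuted mean = {null_mean:.3f}, z = {z:.2f}")
```

A high permuted mean, as in the 0.97 reported above, is compatible with a clear separation as long as the permuted distribution is narrow.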

      We believe the new section, along with the present response, fully addresses the legitimate concerns of the reviewer.

      While the model boasts incredible versatility across tasks and stimulus configurations, fitting behavioral data well doesn't mean we've captured the underlying neural processes, and thus, we need to be careful when interpreting results. For example, the model produces temporal parameters fitting rat behavior that are 4x faster than when fitting human data. This difference in slope and a difference at the tails were interpreted as differences in perceptual sensitivity related to general processing speeds of the rat, presumably related to brain/body size differences. While rats no doubt have these differences in neural processing speed/integration windows, it seems reasonable that a lot of the differences in human and rat psychometric functions could be explained by the (over)training and motivation of rats to perform on every trial for a reward - increasing attention/sensitivity (slope) - and a tendency to make mistakes (compression evident at the tails). Was there an attempt to fit these data with a lapse parameter built into the decisional model as was done in Equation 21? Likewise, the fitted parameters for the pharmacological manipulations during the SJ task indicated differences in the decisional (but not the perceptual) process and the article makes the claim that "all pharmacologically-induced changes in audiovisual time perception" can be attributed to decisional processes "with no need to postulate changes in low-level temporal processing." However, those papers discuss actual sensory effects of pharmacological manipulation, with one specifically reporting changes to response timing. Moreover, and again contrary to the conclusions drawn from model fits to those data, both papers also report a change in psychometric slope/JND in the TOJ task after pharmacological manipulation, which would presumably be reflected in changes to the perceptual (but not the decisional) parameters.

      Fitting or predicting behaviour does not in itself demonstrate that a model captures the underlying neural computations—though it may offer valuable constraints and insights. In line with this, we were careful not to extrapolate the implications of our simulations to specific neural mechanisms.

      Temporal sensitivity is, by definition, a behavioural metric, and—as the reviewer correctly notes—its estimation may reflect a range of contributing factors beyond low-level sensory processing, including attention, motivation, and lapse rates (i.e., stimulus-independent errors). In Equation 21, we introduced a lapse parameter specifically to account for such effects in the context of monkey eye-tracking data. For the rat datasets, however, the inclusion of a lapse term was not required to achieve a close fit to the psychometric data (ρ = 0.981). While it is likely that adding a lapse component would yield a marginally better fit, the absence of single-trial data prevents us from applying model comparison criteria such as AIC or BIC to justify the additional parameter. In light of this, and to avoid unnecessary model complexity, we opted not to include a lapse term in the rat simulations.
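
To make the role of a lapse term concrete, the sketch below shows one common way to add stimulus-independent errors to a psychometric function, together with the standard AIC formula that would require trial counts. It is a generic textbook form and not the manuscript's Equation 21; the cumulative-Gaussian core, the symmetric lapse, and the names `psychometric`, `binomial_log_likelihood`, and `aic` are assumptions.

```python
import math


def psychometric(x, mu, sigma, lapse=0.0):
    """Cumulative Gaussian with a symmetric lapse rate.

    On a fraction `lapse` of trials the response is a coin flip, which
    compresses the curve towards 0.5 at both tails.
    """
    core = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return 0.5 * lapse + (1.0 - lapse) * core


def binomial_log_likelihood(k_yes, n_trials, p_pred, eps=1e-12):
    """Log-likelihood of response counts (binomial coefficient omitted).

    Needs the number of trials per condition, which is exactly what is
    unavailable when only averaged psychometric data are published.
    """
    total = 0.0
    for k, n, p in zip(k_yes, n_trials, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


# With a 10% lapse rate the asymptotes move from 0 and 1 to 0.05 and 0.95.
low_tail = psychometric(-1e6, 0.0, 50.0, lapse=0.1)
high_tail = psychometric(1e6, 0.0, 50.0, lapse=0.1)
print(low_tail, high_tail)
```

With single-trial counts, one would compare `aic` for fits with and without the extra `lapse` parameter; without them, only descriptive fit measures such as correlation or MSE remain.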

      With respect to the pharmacological manipulation data, we acknowledge the reviewer’s point that observed changes in slope and bias could plausibly arise from alterations at either the sensory or decisional level—or both. In our model, low-level sensory processing is instantiated by the MCD architecture, which outputs the MCDcorr and MCDlag signals that are then scaled and integrated during decision-making. Importantly, this scaling operation influences the slope of the resulting psychometric functions, such that changes in slope can arise even in the absence of any change to the MCD’s temporal filters. In our simulations, the temporal constants of the MCD units were fixed to the values estimated from the non-pharmacological condition (see parameter estimation section above), and only the decision-related parameters were allowed to vary. From this modelling perspective, the behavioural effects observed in the pharmacological datasets can be explained entirely by changes at the decisional level. However, we do not claim that such an explanation excludes the possibility of genuine sensory-level changes. Rather, we assert that our model can account for the observed data without requiring modifications to early temporal tuning.

      To rigorously distinguish sensory from decisional effects, future experiments will need to employ stimuli with richer temporal structure—e.g., temporally modulated sequences of clicks and flashes that vary in frequency, phase, rhythm, or regularity (see Fujisaki & Nishida, 2007; Denison et al., 2012; Parise & Ernst, 2016, 2025; Locke & Landy, 2017; Nidiffer et al., 2018). Such stimuli engage the MCD in a more stimulus-dependent manner, enabling a clearer separation between early sensory encoding and later decision-making processes. Unfortunately, the current rat datasets—based exclusively on single click-flash pairings—lack the complexity needed for such disambiguation. As a result, while our simulations suggest that the observed pharmacologically induced effects can be attributed to changes in decision-level parameters, they do not rule out concurrent sensory-level changes.

      In summary, our results indicate that changes in the temporal tuning of MCD units are not necessary to reproduce the observed pharmacological effects on audiovisual timing behaviour. However, we do not assert that such changes are absent or unnecessary in principle. Disentangling sensory and decisional contributions will ultimately require richer datasets and experimental paradigms designed specifically for this purpose. We have now modified the results section (page 6) and the discussion (page 11) to clarify these points.

      The case for the utility of a stimulus-computable model is convincing (as I mentioned above), but its framing as mission-critical for understanding multisensory perception is overstated, I think. The line for what is "stimulus computable" is arbitrary and doesn't seem to be followed in the paper. A strict definition might realistically require inputs to be, e.g., the patterns of light and sound waves available to our eyes and ears, while an even more strict definition might (unrealistically) require those stimuli to be physically present and transduced by the model. A reasonable looser definition might allow an abstract and low-dimensional representation of the stimulus, such as the stimulus envelope (which was used in the paper), to be an input. Ultimately, some preprocessing of a stimulus does not necessarily confound interpretations about (multi)sensory perception. And on the flip side, the stimulus-computable aspect doesn't necessarily give the model supreme insight into perception. For example, the MCD model was "confused" by the stimuli used in our 2018 paper (Nidiffer et al., 2018; Parise & Ernst, 2025). In each of our stimuli (including catch trials), the onset and offset drove strong AV temporal correlations across all stimulus conditions, but were irrelevant to participants performing an amplitude modulation detection task. The to-be-detected amplitude modulations, set at individual thresholds, were not a salient aspect of the physical stimulus, and thus only marginally affected stimulus correlations. The model was, of course, able to fit our data by "ignoring" the on/offsets (i.e., requiring human intervention), again highlighting that the model is tapping into a very basic and ubiquitous computational principle of (multi)sensory perception. But it does reveal a limitation of such a stimulus-computable model: that it is (so far) strictly bottom-up.

      We appreciate the reviewer’s thoughtful engagement with the concept of stimulus computability. We agree that the term requires careful definition and should not be taken as a guarantee of perceptual insight or neural plausibility. In our work, we define a model as “stimulus-computable” if all its inputs are derived directly from the stimulus, rather than from experimenter-defined summary descriptors such as temporal lag, spatial disparity, or cue reliability. In the context of multisensory integration, this implies that a model must account not only for how cues are combined, but also for how those cues are extracted from raw inputs—such as audio waveforms and visual contrast sequences.

      This distinction is central to our modelling philosophy. While ideal observer models often specify how information should be combined once identified, they typically do not address the upstream question of how this information is extracted from sensory input. In that sense, models that are not stimulus-computable leave out a key part of the perceptual pipeline. We do not present stimulus computability as a marker of theoretical superiority, but rather as a modelling constraint that is necessary if one’s aim is to explain how structured sensory input gives rise to perception. This is a view that is also explicitly acknowledged and supported by Reviewer 2.

      Framed in Marr’s (1982) terms, non–stimulus-computable models tend to operate at the computational level, defining what the system is doing (e.g., computing a maximum likelihood estimate), whereas stimulus-computable models aim to function at the algorithmic level, specifying how the relevant representations and operations might be implemented. When appropriately constrained by biological plausibility, such models may also inform hypotheses at the implementational level, pointing to potential neural substrates that could instantiate the computation.

      Regarding the reviewer’s example illustrating a limitation of the MCD model, we respectfully note that the account appears to be based on a misreading of our prior work. In Parise & Ernst (2025), where we simulated the stimuli from Nidiffer et al. (2018), the MCD model reproduced participants’ behavioural data without any human intervention or adjustment. The model was applied in a fully bottom-up, stimulus-driven manner, and its output aligned with observer responses as-is. We suspect the confusion may stem from analyses shown in Figure 6 - Supplement Figure 5 of Parise & Ernst (2025), where we investigated the lack of a frequency-doubling effect in the Nidiffer et al. data. However, those analyses were based solely on the Pearson correlation between auditory and visual stimulus envelopes and did not involve the MCD model. No manual exclusion of onset/offset events was applied, nor was the MCD used in those particular figures. We also note that Parise & Ernst (2025) is a separate, already published study and is not the manuscript currently under review. 

      In summary, while we fully agree that stimulus computability does not resolve all the complexities of multisensory perception (see comments below about speech), we maintain that it provides a valuable modelling constraint—one that enables robust, generalisable predictions when appropriately scoped. 

      The manuscript rightly chooses to focus a lot of the work on speech, fitting the MCD model to predict behavioral responses to speech. The range of findings from AV speech experiments that the MCD can account for is very convincing. Given the provided context that speech is "often claimed to be processed via dedicated mechanisms in the brain," a statement claiming a "first end-to-end account of multisensory perception," and findings that the MCD model can account for speech behaviors, it seems the reader is meant to infer that energetic correlation detection is a complete account of speech perception. I think this conclusion misses some facets of AV speech perception, such as integration of higher-order, non-redundant/correlated speech features (Campbell, 2008) and also the existence of top-down and predictive processing that aren't (yet!) explained by MCD. For example, one important benefit of AV speech is interactions on linguistic processes - how complementary sensitivity to articulatory features in the auditory and visual systems (Summerfield, 1987) allows constraint of linguistic processes (Peelle & Sommers, 2015; Tye-Murray et al., 2007).

      We thank the reviewer for their thoughtful comments, and especially for the kind words describing the range of findings from our AV speech simulations as “very convincing.”

      We would like to clarify that it is not our view that speech perception can be reduced to energetic correlation detection. While the MCD model captures low- to mid-level temporal dependencies between auditory and visual signals, we fully agree that a complete account of audiovisual speech perception must also include higher-order processes—including linguistic mechanisms and top-down predictions. These are critical components of AV speech comprehension, and lie beyond the scope of the current model.

      Our use of the term “end-to-end” is intended in a narrow operational sense: the model transforms raw audiovisual input (i.e., audio waveforms and video frames) directly into behavioural output (i.e., button press responses), without reliance on abstracted stimulus parameters such as lag, disparity or reliability. It is in this specific technical sense that the MCD offers an end-to-end model. We have revised the manuscript to clarify this usage to avoid any misunderstanding.

      In light of the reviewer’s valuable point, we have now edited the Discussion to acknowledge the importance of linguistic processes (page 13) and to clarify what we mean by end-to-end account (page 11). We agree that future work will need to explore how stimulus-computable models such as the MCD can be integrated with broader frameworks of linguistic and predictive processing (e.g., Summerfield, 1987; Campbell, 2008; Peelle & Sommers, 2015; Tye-Murray et al., 2007).

      References

      Campbell, R. (2008). The processing of audio-visual speech: empirical and neural bases. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1493), 1001-1010. https://doi.org/10.1098/rstb.2007.2155

      Nidiffer, A. R., Diederich, A., Ramachandran, R., & Wallace, M. T. (2018). Multisensory perception reflects individual differences in processing temporal correlations. Scientific Reports, 8(1), 1-15. https://doi.org/10.1038/s41598-018-32673-y

      Parise, C. V., & Ernst, M. O. (2025). Multisensory integration operates on correlated input from unimodal transient channels. eLife, 12. https://doi.org/10.7554/ELIFE.90841

      Peelle, J. E., & Sommers, M. S. (2015). Prediction and constraint in audiovisual speech perception. Cortex, 68, 169-181. https://doi.org/10.1016/j.cortex.2015.03.006

      Summerfield, Q. (1987). Some preliminaries to a comprehensive account of audio-visual speech perception. In B. Dodd & R. Campbell (Eds.), Hearing by Eye: The Psychology of Lip-Reading (pp. 3-51). Lawrence Erlbaum Associates.

      Tye-Murray, N., Sommers, M., & Spehar, B. (2007). Auditory and visual lexical neighborhoods in audiovisual speech perception. Trends in Amplification, 11(4), 233-241. https://doi.org/10.1177/1084713807307409

      Reviewer #2 (Public review):

      Summary:

      Building on previous models of multisensory integration (including their earlier correlation-detection framework used for non-spatial signals), the author introduces a population-level Multisensory Correlation Detector (MCD) that processes raw auditory and visual data. Crucially, it does not rely on abstracted parameters, as is common in normative Bayesian models, but rather works directly on the stimulus itself (i.e., individual pixels and audio samples). By systematically testing the model against a range of experiments spanning human, monkey, and rat data, the authors show that their MCD population approach robustly predicts perception and behavior across species with a relatively small (0-4) number of free parameters.

      Strengths:

      (1) Unlike prior Bayesian models that used simplified or parameterized inputs, the model here is explicitly computable from full natural stimuli. This resolves a key gap in understanding how the brain might extract "time offsets" or "disparities" from continuously changing audio-visual streams.

      (2) The same population MCD architecture captures a remarkable range of multisensory phenomena, from classical illusions (McGurk, ventriloquism) and synchrony judgments, to attentional/gaze behavior driven by audio-visual salience. This generality strongly supports the idea that a single low-level computation (correlation detection) can underlie many distinct multisensory effects.

      (3) By tuning model parameters to different temporal rhythms (e.g., faster in rodents, slower in humans), the MCD explains cross-species perceptual data without reconfiguring the underlying architecture.

      We thank the reviewer for their positive evaluation of the manuscript, and particularly for highlighting the significance of the model's stimulus-computable architecture and its broad applicability across species and paradigms. Please find our responses to the individual points below.

      Weaknesses:

      (1) The authors show how a correlation-based model can account for the various multisensory integration effects observed in previous studies. However, a comparison of how the two accounts differ would shed light on whether the correlation model is an implementation of the Bayesian computations (at a different level in Marr's hierarchy) or makes testable predictions that can distinguish between the two frameworks. For example, the Bayesian model predicts that the uncertainty of the cue-combined estimate is given by (half) the harmonic mean of the unimodal variances. So, how the MCD framework predicts this reduced uncertainty could be one potential difference (or similarity) to the Bayesian model.

      We fully agree with the reviewer that a comparison between the correlation-based MCD model and Bayesian accounts is valuable—particularly for clarifying how the two frameworks differ conceptually and where they may converge.

      As noted in the revised manuscript, the key distinction lies in the level of analysis described by Marr (1982). Bayesian models operate at the computational level, describing what the system is aiming to compute (e.g., optimal cue integration). In contrast, the MCD functions at the algorithmic level, offering a biologically plausible mechanism for how such integration might emerge from stimulus-driven representations.

      In this context, the MCD provides a concrete, stimulus-grounded account of how perceptual estimates might be constructed—potentially implementing computations with Bayesian-like characteristics (e.g., reduced uncertainty, cue weighting). Thus, the two models are not mutually exclusive but can be seen as complementary: the MCD may offer an algorithmic instantiation of computations that, at the abstract level, resemble Bayesian inference.

      We have now updated the manuscript to explicitly highlight this relationship (pages 2 and 11). In the revised manuscript, we also included a new figure (Figure 5) and movie (Supplementary Movie 3), to show how the present approach extends previous Bayesian models for the case of cue integration (i.e., the ventriloquist effect).
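
For reference, the reduced-uncertainty prediction of the Bayesian account raised by the reviewer can be written out in a few lines. This is the standard reliability-weighted (maximum-likelihood) combination of two independent Gaussian cues, not code from the manuscript; the function name `combine_cues` and the example numbers are assumptions. The combined variance equals half the harmonic mean of the two unimodal variances, so it is never larger than the smaller of the two.

```python
def combine_cues(mu_a, var_a, mu_v, var_v):
    """Reliability-weighted combination of two independent Gaussian cues.

    Each cue is weighted by its reliability (inverse variance); the combined
    variance is var_a * var_v / (var_a + var_v).
    """
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_v)
    w_v = 1.0 - w_a
    mu = w_a * mu_a + w_v * mu_v
    var = (var_a * var_v) / (var_a + var_v)
    return mu, var


# Equally reliable cues: the estimate lies midway and the variance halves.
mu_equal, var_equal = combine_cues(0.0, 4.0, 10.0, 4.0)

# Hypothetical unequal case: the estimate is pulled towards the more
# reliable cue (here the first one).
mu_unequal, var_unequal = combine_cues(0.0, 1.0, 10.0, 4.0)

print(mu_equal, var_equal, mu_unequal, var_unequal)
```

Any algorithmic-level account that claims Bayesian-like behaviour can be checked against these two signatures: the shift of the combined estimate towards the more reliable cue and the reduction in variance.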

      (2) The authors show a good match for cue combination involving two cues. While Bayesian accounts provide a direction for extension to more cues (also seen empirically, e.g., in Hecht et al. 2008), discussion on how the MCD model extends to more cues would benefit the readers.

      We thank the reviewer for this insightful comment: extending the MCD model to include more than two sensory modalities is a natural and valuable next step. Indeed, one of the strengths of the MCD framework lies in its modularity. Let us consider the MCDcorr output (Equation 6), which is computed as the pointwise product of transient inputs across modalities. Extending this to include a third modality, such as touch, is straightforward: MCD units would simply multiply the transient channels from all three modalities, effectively acting as trimodal coincidence detectors that respond when all inputs are aligned in time and space.

      By contrast, extending MCDlag is less intuitive, due to its reliance on opponency between two subunits (via subtraction). A plausible solution is to compute MCDlag in a pairwise fashion (e.g., AV, VT, AT), capturing relative timing across modality pairs.

      Importantly, the bulk of the spatial integration in our framework is carried by MCDcorr, which generalises naturally to more than two modalities. We have now formalised this extension and included a graphical representation in a supplementary section of the revised manuscript.
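
The multiplicative extension described above can be sketched in a few lines. This is a deliberately reduced illustration, not the model's Equation 6: it keeps only the pointwise product across channels and omits the temporal filtering and the subunit structure of the MCD. The function name `coincidence`, the channel values, and the modality labels are assumptions.

```python
import math
from itertools import combinations


def coincidence(*channels):
    """Pointwise product across any number of transient channels.

    The output is non-zero only at samples where every channel is active,
    which is the coincidence-detection idea behind the proposed extension.
    """
    return [math.prod(samples) for samples in zip(*channels)]


# Hypothetical transient responses (arbitrary units) over four time samples.
audio = [0.0, 1.0, 0.5, 0.0]
video = [0.0, 0.8, 0.5, 0.2]
touch = [0.0, 1.0, 0.0, 0.3]

bimodal = coincidence(audio, video)          # audio-visual product
trimodal = coincidence(audio, video, touch)  # requires all three to be active

# For the opponent (lag) signal, the response suggests a pairwise scheme;
# here only the modality pairs that would be evaluated are enumerated.
lag_pairs = list(combinations(["A", "V", "T"], 2))

print(bimodal, trimodal, lag_pairs)
```

In this toy case the third sample drops out of the trimodal output because touch is silent there, even though audio and video coincide.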

      Likely Impact and Usefulness:

      The work offers a compelling unification of multiple multisensory tasks (temporal order judgments, illusions, Bayesian causal inference, and overt visual attention) under a single, fully stimulus-driven framework. Its success with natural stimuli should interest computational neuroscientists, systems neuroscientists, and machine learning scientists. This paper thus makes an important contribution to the field by moving beyond minimalistic lab stimuli, illustrating how raw audio and video can be integrated using elementary correlation analyses.

      Reviewer #1 (Recommendations for the authors):

      Recommendations:

      My biggest concern is a lack of specificity about model fitting, which would be assuaged by the inclusion of sufficient detail to replicate the analysis completely or the inclusion of the analysis code. The code availability statement indicates a script for the population model will be included, but it is unclear if this code will provide the fitting details for the whole of the analysis.

      We thank the reviewer for raising this important point. A new methodological section has been added to the manuscript, detailing the model fitting procedures used throughout the study. In addition, the accompanying code repository now includes MATLAB scripts that allow full replication of the spatiotemporal MCD simulations.

      Perhaps it could be enlightening to re-evaluate the model with a measure of error rather than correlation? And I think many researchers would be interested in the model's performance on unseen data.

      The model has now been re-evaluated using mean squared error (MSE), and the results remain consistent with those obtained using Pearson correlation. Additionally, we have clarified which parts of the study involve testing the model on unseen data (i.e., data not used to fit the temporal constants of the units). These analyses are now included and discussed in the revised fitting section of the manuscript (pages 23-24).

      Otherwise, my concerns involve the interpretation of findings, and thus could be satisfied with minor rewording or tempering conclusions.

      The manuscript has been revised to address these interpretative concerns, with several conclusions reworded or tempered accordingly. All changes are marked in blue in the revised version.

      Miscellanea:

      Should b0 in equation 10 be bcrit to match the below text?

      Thank you for catching this inconsistency. We have corrected Equation 10 (and also Equation 21) to use the more transparent notation bcrit instead of b0, in line with the accompanying text.

      Equation 23, should time be averaged separately? For example, if multiple people are speaking, the average correlation for those frames will be higher than the average correlation across all times.

      We thank the reviewer for raising this thoughtful and important point. In response, we have clarified the notation of Equation 23 in the revised manuscript (page 20). Specifically, we now denote the averaging operations explicitly as spatial means and standard deviations across all pixel locations within each frame.

      This equation computes the z-score of the MCD correlation value at the current gaze location, normalized relative to the spatial distribution of correlation values in the same frame. That is, all operations are performed at the frame level, not across time. This ensures that temporally distinct events are treated independently and that the final measure reflects relative salience within each moment, not a global average over the stimulus. In other words, the spatial distribution of MCD activity is re-centered and rescaled at each frame, exactly to avoid the type of inflation or confounding the reviewer rightly cautioned against.
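
The frame-level normalization described above can be illustrated with a short sketch. It is not the manuscript's Equation 23 verbatim: the use of the population standard deviation, the handling of flat frames, the names `gaze_z_score` and `gaze_z_scores`, and the toy frames are assumptions.

```python
import math
import statistics


def gaze_z_score(frame, gaze_row, gaze_col):
    """Z-score of the value at the gaze location within a single frame.

    The mean and standard deviation are taken across all pixel locations
    of this frame only, never across time.
    """
    values = [value for row in frame for value in row]
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0.0:
        return 0.0  # flat frame: no location is more salient than another
    return (frame[gaze_row][gaze_col] - mean) / sd


def gaze_z_scores(frames, gaze_positions):
    """Apply the per-frame normalization independently to every frame."""
    return [gaze_z_score(frame, row, col)
            for frame, (row, col) in zip(frames, gaze_positions)]


# Two hypothetical 2x2 correlation maps; the second has a higher overall
# level (e.g., more concurrent audiovisual events) but the same spatial contrast.
frames = [
    [[0.0, 0.0], [0.0, 4.0]],
    [[10.0, 10.0], [10.0, 14.0]],
]
gaze = [(1, 1), (1, 1)]

z = gaze_z_scores(frames, gaze)
print(z)
```

Both frames yield the same z-score at the gaze location, which is the sense in which re-centering and rescaling each frame prevents frames with globally higher correlation from dominating the measure.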

      Reviewer #2 (Recommendations for the authors):

      The authors have done a great job of providing a stimulus computable model of cue combination. I had just a few suggestions to strengthen the theoretical part of the paper:

      (1) While the authors have shown a good match between MCD and cue combination, some theoretical justification or equivalence analysis would benefit readers on how the two relate to each other. Something like Zhang et al. 2019 (which is for motion cue combination) would add to the paper.

      We agree that it is important to clarify the theoretical relationship between the Multisensory Correlation Detector (MCD) and normative models of cue integration, such as Bayesian combination. In the revised manuscript, we have now modified the introduction and added a paragraph in the Discussion addressing this link more explicitly. In brief, we see the MCD as an algorithmic-level implementation (in Marr’s terms) that may approximate or instantiate aspects of Bayesian inference.

      (2) Simulating cue combination for tasks that require integration of more than two cues (visual, auditory, haptic cues) would more strongly relate the correlation model to Bayesian cue combination. If that is a lot of work, at least discussing this would benefit the paper.

      This point has now been addressed, and a new paragraph discussing the extension of the MCD model to tasks involving more than two sensory modalities has been added to the Discussion section.

    1. Reviewer #2 (Public review):

      Summary:

      This manuscript is a technical report on a new model of early neurogenesis, coupled to a novel platform for genetic screens. The model is more faithful than others published to date, and the screening platform is an advance over existing ones in terms of speed and throughput.

      Strengths:

      It is novel and useful.

      Weaknesses:

      The novelty of the results is limited in terms of biology, mainly a proof of concept of the platform and a very good demonstration of the hierarchical interactions of the top regulators of GRNs.

      The value of the manuscript could be enhanced in two ways:

      (1) by showing its versatility and transforming the level of neural tube to midbrain and hindbrain, and looking at the transcriptional hierarchies there.

      (2) by relating the patterning of the organoids to the situation in vivo, in particular with the information in reference 49. The authors make a statement "To compare our findings with in vivo gene expression patterns, we applied the same approach to published scRNA-seq data from 4-week-old human embryos at the neurula stage" but it would be good to have a more nuanced reference: what stage, what genes are missing, what do they add to the information in that reference?

    1. Why is this study unethical? How were the men who participated and their families harmed as a function of this research?

      This study is unethical because the researchers did not spread awareness among the participants, and this affected their families.

    2. Why is this study unethical? How were the men who participated and their families harmed as a function of this research?

      This study was unethical because the researchers were not honest, and they put not just the participants but also their families in harm's way.

    1. Reviewer #1 (Public review):

      This study established a C921Y OGT-ID mouse model, systematically demonstrating in mammals the pathological link between O-GlcNAc metabolic imbalance and neurodevelopmental disorders (cortical malformation, microcephaly) as well as behavioral abnormalities (hyperactivity, impulsivity, learning/memory deficits). However, critical flaws in the current findings require resolution to ensure scientific rigor.

      The most concerning finding appears in Figure S12. While Supplementary Figure S12 demonstrates decreased OGA expression without significant OGT level changes in C921Y mutants via Western blot/qPCR, previous reports (Florence Authier, et al., Dis Model Mech. 2023) described OGT downregulation in Western blot and an increase in qPCR in the same models. The opposite OGT expression outcomes in supposedly identical mouse models directly challenge the model's reliability. This discrepancy raises serious concerns about either the experimental execution or the interpretation of results. The authors must revalidate the data with rigorous controls or provide a molecular biology-based explanation.

      A few additional comments to the author may be helpful to improve the study.

      Major

      (1) While this study systematically validated multi-dimensional phenotypes (including neuroanatomical abnormalities and behavioral deficits) in OGT C921Y mutant mice, there is a lack of relevant mechanisms and intervention experiments. For example, the absence of targeted intervention studies on key signaling pathways prevents verification of whether proteomics-identified molecular changes directly drive phenotypic manifestations.

      (2) Although MRI detected nodular dysplasia and heterotopia in the cingulate cortex, the cellular basis remains undefined. Spatiotemporal immunofluorescence analysis using neuronal (NeuN), astrocytic (GFAP), and synaptic (Synaptophysin) markers is recommended to identify affected cell populations (e.g., radial glial migration defects or intermediate progenitor differentiation abnormalities).

      (3) While proteomics revealed dysregulation in pathways including Wnt/β-catenin and mTOR signaling, two critical issues remain unresolved: a) O-GlcNAc glycoproteomic alterations remain unexamined; b) The causal relationship between pathway changes and O-GlcNAc imbalance lacks validation. It is recommended to use co-immunoprecipitation or glycosylation sequencing to confirm whether the relevant proteins undergo O-GlcNAc modification changes, identify specific modification sites, and verify their interactions with OGT.

      (4) Given that OGT-ID neuropathology likely originates embryonically, we recommend serial analyses from E14.5 to P7 to examine cellular dynamics during critical corticogenesis phases.

      (5) The interpretation of Figure 8A constitutes overinterpretation. Current data fail to conclusively demonstrate impairment of OGT's protein interaction network and lack direct evidence supporting the proposed mechanisms of HCF1 misprocessing or OGA loss.

    2. Author response:

      Reviewer #1 (Public review):

      This study established a C921Y OGT-ID mouse model, systematically demonstrating in mammals the pathological link between O-GlcNAc metabolic imbalance and neurodevelopmental disorders (cortical malformation, microcephaly) as well as behavioral abnormalities (hyperactivity, impulsivity, learning/memory deficits). However, critical flaws in the current findings require resolution to ensure scientific rigor.

      The most concerning finding appears in Figure S12. While Supplementary Figure S12 demonstrates decreased OGA expression without significant OGT level changes in C921Y mutants via Western blot/qPCR, previous reports (Florence Authier, et al., Dis Model Mech. 2023) described OGT downregulation in Western blot and an increase in qPCR in the same models. The opposite OGT expression outcomes in supposedly identical mouse models directly challenge the model's reliability. This discrepancy raises serious concerns about either the experimental execution or the interpretation of results. The authors must revalidate the data with rigorous controls or provide a molecular biology-based explanation.

      The referee’s assessment is based on a misunderstanding – these are certainly not the same experiment repeated twice with different answers. In the previous report of the OGT-C921Y mutant mice (Florence Authier, et al., Dis Model Mech. 2023), OGT and OGA mRNA/protein expression were assessed in total brain protein extract from 3-month-old male mice. In that study we observed a significant reduction in OGT protein levels while OGT mRNA levels were significantly increased in the mutant compared to WT controls. However, in the current study (Figure S12), the OGA and OGT mRNA/protein expression measurements a) are restricted to the prefrontal cortex and b) are from 4-month-old male mice, which does not allow a direct comparison of the two studies. In the prefrontal cortex, OGT protein levels are not changed while OGT mRNA levels are increased (similarly to the total brain data), albeit not significantly. The different outcomes for OGT protein levels in total brain and prefrontal cortex could suggest regional differences in OGT protein levels/stability, as OGT mRNA levels are increased in both cases. Three other brain regions (hippocampus, striatum and cerebellum) have now also been assessed for OGT mRNA/protein expression, supporting such regional differences in OGT protein levels, and these data will be included in the new version of the manuscript.

      A few additional comments to the author may be helpful to improve the study.

      Major

      (1) While this study systematically validated multi-dimensional phenotypes (including neuroanatomical abnormalities and behavioral deficits) in OGT C921Y mutant mice, there is a lack of relevant mechanisms and intervention experiments. For example, the absence of targeted intervention studies on key signaling pathways prevents verification of whether proteomics-identified molecular changes directly drive phenotypic manifestations.

      We agree with the referee that these experiments would further strengthen the work. They would, however, result in a 1-5 year delay in sharing this work with the scientific and patient communities. We will continue to work along these lines and report separately in the future.

      (2) Although MRI detected nodular dysplasia and heterotopia in the cingulate cortex, the cellular basis remains undefined. Spatiotemporal immunofluorescence analysis using neuronal (NeuN), astrocytic (GFAP), and synaptic (Synaptophysin) markers is recommended to identify affected cell populations (e.g., radial glial migration defects or intermediate progenitor differentiation abnormalities).

      We are currently performing these experiments so that they can be included in the version of record of this manuscript.

      (3) While proteomics revealed dysregulation in pathways including Wnt/β-catenin and mTOR signaling, two critical issues remain unresolved: a) O-GlcNAc glycoproteomic alterations remain unexamined; b) The causal relationship between pathway changes and O-GlcNAc imbalance lacks validation. It is recommended to use co-immunoprecipitation or glycosylation sequencing to confirm whether the relevant proteins undergo O-GlcNAc modification changes, identify specific modification sites, and verify their interactions with OGT.

      We agree with the referee that these experiments would further strengthen the work and will perform further experiments to explore whether these pathways are functionally affected. However, it is important to note that the inference that these proteins must themselves be O-GlcNAc modified is incorrect – indeed, O-GlcNAcylation of unknown protein kinase X, E3 ligase/DUB Y, or transcription factor Z could indirectly affect these pathways/proteins.

      (4) Given that OGT-ID neuropathology likely originates embryonically, we recommend serial analyses from E14.5 to P7 to examine cellular dynamics during critical corticogenesis phases.

      We agree with the referee that these experiments would further strengthen the work. They would, however, result in a significant delay in sharing this work with the scientific and patient communities. We will continue to work along these lines and report separately in the future.

      (5) The interpretation of Figure 8A constitutes overinterpretation. Current data fail to conclusively demonstrate impairment of OGT's protein interaction network and lack direct evidence supporting the proposed mechanisms of HCF1 misprocessing or OGA loss.

      For clarity, we will remove panel A from Figure 8 in the version of record – this panel was only ever meant to represent a priori hypotheses for OGT-CDG mechanisms, none of which have been either excluded or confirmed.

      Reviewer #2 (Public review):

      Summary:

      The authors are trying to understand why certain mutants of O-GlcNAc transferase (OGT) appear to cause developmental disorders in humans. As an important step towards that goal, the authors generated a mouse model with one of these mutations that disrupts OGT activity. They then go on to test these mice for behavioral differences, finding that the mutant mice exhibit some signs of hyperactivity and differences in learning and memory. They then examine alterations to the structure of the brain and skull, and again find changes in the mutant mice that have been associated with developmental disorders. Finally, they identify proteins that are up- or down-regulated between the two mice as potential mechanisms to explain the observations.

      Strengths:

      The major strength of this manuscript is the creation of this mouse model, as a key step in beginning to understand how OGT mutants cause developmental disorders. This line will prove important for not only the authors but other investigators as well, enabling the testing of various hypotheses and potentially treatments. The experiments are also rigorously performed, and the conclusions are well supported by the data.

      Weaknesses:

      The only weakness identified is a lack of mechanistic insight. However, this certainly may come in the future through more targeted experimentation using this mouse model.

      We agree with the referee that these experiments would further strengthen the work. They would, however, result in a 1-5 year delay in sharing this work with the scientific and patient communities. We will continue to work along these lines and report separately in the future.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript presents a robust set of experiments that provide new fundamental insights into the role of STN neurons during active and passive avoidance tasks. These forms of avoidance have received comparatively less attention in the literature than the more extensively studied escape or freezing responses, despite being extremely relevant to human behaviour and more strongly influenced by cognitive control.

      Strengths:

      Understanding the neural infrastructure supporting avoidance behaviour would be a fundamental milestone in neuroscience. The authors employ sophisticated methods, including calcium imaging and optogenetics, to delineate the functions of STN neurons during avoidance behaviours. The work is extremely thorough, and the evidence presented is compelling. Experiments are carefully constructed, well-controlled, and the statistical analyses are appropriate.

      Points for Authors' Consideration:

      (1) Motoric role of STN: The authors interpret their findings within the context of active avoidance, a cognitively demanding process. An alternative interpretation is that STN activation enhances global motoric tone, facilitating general movement rather than specifically encoding cautious avoidance. Experimentally, this could be evaluated by examining STN-induced motoric tone in non-avoidance contexts, such as open field tests with bilateral stimulations. Alternatively, or additionally, the authors could explicitly discuss evidence for and against the possibility that increased motoric tone may account for aspects of the observed behaviours.

      (2) Temporal Dynamics in Calcium Imaging (AA2 vs. AA1): Based on previous work by this group, a delay (~1-2 sec) in neuronal response onset was anticipated in AA2 compared to AA1. Although a delay in peak response is observed, there is no clear evidence of a significant delay in response onset or changes in slope of neural activity. The authors could quantify calcium onset latencies and slopes and statistically compare these parameters across conditions.
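
      One way such a quantification could look is sketched below. This is only an illustration of the suggestion, not a method taken from the manuscript: the threshold rule (baseline mean plus a multiple of the baseline SD, sustained for a few samples), the parameter defaults, and all function names are assumptions.

```python
from statistics import mean, pstdev

def onset_latency(trace, times, baseline_end=0.0, n_sd=3.0, min_samples=3):
    """Return the first post-cue time at which the trace stays above
    baseline mean + n_sd * baseline SD for min_samples consecutive samples.
    Returns None if the criterion is never met (hypothetical criterion)."""
    baseline = [v for v, t in zip(trace, times) if t < baseline_end]
    threshold = mean(baseline) + n_sd * pstdev(baseline)
    run = 0
    for i, (v, t) in enumerate(zip(trace, times)):
        if t < baseline_end:
            continue
        run = run + 1 if v > threshold else 0
        if run == min_samples:
            return times[i - min_samples + 1]
    return None

def rise_slope(trace, times, onset_time):
    """Average slope between the onset sample and the subsequent peak."""
    i_on = times.index(onset_time)
    i_pk = max(range(i_on, len(trace)), key=lambda i: trace[i])
    if i_pk == i_on:
        return 0.0
    return (trace[i_pk] - trace[i_on]) / (times[i_pk] - times[i_on])

# Toy trial-averaged trace: cue at t = 0 s, signal rises from t = 0.2 s
times = [-0.4, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
trace = [0.0, 0.1, 0.0, 0.1, 0.1, 0.1, 1.0, 2.0, 3.0, 2.5]
latency = onset_latency(trace, times)
slope = rise_slope(trace, times, latency)
```

      Computing one latency and one slope per animal and per condition (AA1, AA2) would then allow a paired comparison across animals.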

      (3) Speed Differences (AA2 vs. AA1): Given the increased latency in AA2, and based on previous work from the group, one would expect faster movements following initiation. However, such differences are not evident in the presented data. The authors might want to discuss the absence of an expected speed increase and clarify whether this absence is consistent with previous findings.

      (4) Behavioural Differences Across Neuronal Classes (Figure 7): The manuscript currently does not compare responses of neuronal classes I, II, and III between AA1 and AA2 conditions separately or provide information regarding their activity during AA3.

      (5) Streamlining Narrative and Figures: Given the extensive amount of material presented, the manuscript and figures would benefit from streamlining. Many data points and graphs could be moved to supplementary materials without affecting the core interpretation and simplifying the reading of the work by a non-expert audience. Similarly, the main text could be refined to more clearly emphasise the key findings, which would improve both readability and impact. At the same time, certain aspects would benefit from additional clarification. For example, it would be helpful to explain the key features of the AA1-AA3 tasks at the point of introduction, rather than referring readers to previous literature. Overall, enhancing clarity and accessibility would serve the authors well and broaden the impact of the work.

    2. Reviewer #2 (Public review):

      Summary:

      Zhou, Sajid et al. present a study investigating the STN involvement in signaled movement. They use fiber photometry, implantable lenses, and optogenetics during active avoidance experiments to evaluate this. The data are useful for the scientific community, and the overall evidence for their claims is solid, but many aspects of the findings are confusing and seemingly contradictory. For example, STN activity increases with contraversive turning in the fiber photometry experiments, but optogenetic stimulation of the STN evokes ipsiversive turning. While the authors present a huge collection of data, it is somewhat difficult to extract the key information and the meaningful implications resulting from this data.

      Strengths:

      The study is comprehensive in using many techniques, stimulation powers, frequencies, and configurations.

      Weaknesses:

      Here are the specific weaknesses of the paper.

      (1) Vglut2 isn't a very selective promoter for the STN. Did the authors verify every injection across brain slices to ensure the para-subthalamic nucleus, thalamus, lateral hypothalamus, and other Vglut2-positive structures were never infected?

      (2) The authors say in the methods that the high vs low power laser activation for optogenetic experiments was defined by the behavioral output. This is misleading, and the high vs low power should be objectively stated and the behavioral results divided according to the power used, not according to the behavioral outcome.

      (3) In the fiber photometry experiments exposing mice to the range of tones, it is impossible to separate the STN response to the tone from the STN response to the movement evoked by the tone. The authors should expose the mouse to the tones in a condition that prevents movement, such as anesthetized or restrained, to separate out the two components.

      (4) The claim 'STN activation is ideally suited to drive active avoids' needs more explanation. This claim comes after the fiber photometry experiments during active avoidance tasks, so there has been no causality established yet.

      (5) The statistical comparisons in Figure 7E need some justification and/or clarification. The 9 neuron types are originally categorized based on their response during avoids, then statistics are run showing that they respond differently during avoids. It is no surprise that they would have significantly different responses, since that is how they were classified in the first place. The authors must explain this further and show that this is not a case of circular reasoning.
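
      A common way to avoid this kind of circularity is to classify neurons on one subset of trials and test the class differences on held-out trials. The sketch below illustrates the idea only; the three class labels, the threshold, the alternating-trial split, and the function name are assumptions and do not reproduce the classification used in the manuscript.

```python
from statistics import mean

def split_half_check(responses, threshold=0.5):
    """responses: dict mapping neuron id -> list of per-trial response amplitudes.
    Each neuron is labelled from one half of its trials (training half);
    the returned values are mean responses on the other, held-out half."""
    held_out = {"activated": [], "inhibited": [], "unresponsive": []}
    for neuron, trials in responses.items():
        train = trials[0::2]     # trials used only for classification
        test = trials[1::2]      # trials used only for evaluation
        m = mean(train)
        if m > threshold:
            label = "activated"
        elif m < -threshold:
            label = "inhibited"
        else:
            label = "unresponsive"
        held_out[label].append(mean(test))
    return held_out

# Toy data: three neurons, four trials each
responses = {
    "n1": [2.0, 1.8, 2.2, 2.0],
    "n2": [-1.5, -1.0, -1.5, -2.0],
    "n3": [0.9, -0.8, -0.7, 0.6],
}
held_out = split_half_check(responses)
```

      If the classes still differ on the held-out half, the difference cannot be a by-product of the selection step, because the evaluated trials played no part in assigning the labels.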

      (6) The authors show that neurons that have strong responses to orientation show reduced activity during avoidance. What are the implications of this? The author should explain why this is interesting and important.

      (7) It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing.

      (8) The experiments in Figure 10 are used to say that STN stimulation is not aversive, but they only show that STN stimulation cannot be used as punishment in place of a shock. This doesn't mean that it is not aversive; it just means it is not as aversive as a shock. The authors should do a simpler aversion test, such as conditioned or real-time place preference, to claim that STN stimulation is not aversive. This is particularly surprising as previous work (Serra et al., 2023) does show that STN stimulation is aversive.

      (9) In the discussion, the idea that the STN encodes 'moving away' from contralateral space is pretty vague and unsupported. It is puzzling that the STN activates more strongly to contraversive turns, but when stimulated, it evokes ipsiversive turns; however, it seems a stretch to speculate that this is related to avoidance. In the last experiments of the paper, the axons from the STN to the GPe and to the midbrain are selectively stimulated. Do these evoke ipsiversive turns similarly?

      (10) In the discussion, the authors claim that the STN is essential for modulating action timing in response to demands, but their data really only show this in one direction. The STN stimulation reliably increases the speed of response in all conditions (except maximum speed conditions such as escapes). It seems to be over-interpreting the data to say this is an inability to modulate the speed of the task, especially as clear learning and speed modulation do occur under STN lesion conditions, as shown in Figure 12B. The mice learn to avoid and increase their latency in AA2 vs AA1, though the overall avoids and latency are different from controls. The more parsimonious conclusion would be that STN stimulation biases movement speed (increasing it) and that this is true in many different conditions.

      (11) In the discussion, the authors claim that the STN projections to the midbrain tegmentum directly affect the active avoidance behavior, while the STN projections to the SNr do not affect it. This seems counter to their results, which show STN projections to either area can alter active avoidance behavior. What is the laser power used in these terminal experiments? If it is high (3mW), the authors may be causing antidromic action potentials in the STN somas, resulting in glutamate release in many brain areas, even when terminals are only stimulated in one area. The authors could use low (0.25mW) laser power in the terminals to reduce the chance of antidromic activation and spatially restrict the optical stimulation.

      (12) Was normality tested for data prior to statistical testing?

      (13) Why are there no error bars on Figure 5B, black circles and orange triangles?

    3. Reviewer #3 (Public review):

      Summary:

      The authors use calcium recordings from STN to measure STN activity during spontaneous movement and in a multi-stage avoidance paradigm. They also use optogenetic excitation, optogenetic inhibition, and lesion approaches to increase or decrease the activity of STN during the avoidance paradigm. The paper reports a large amount of data and makes many claims; some seem well supported to this Reviewer, others not so much.

      Strengths:

      Well-supported claims include data showing that during spontaneous movements, especially contraversive ones, STN calcium activity is increased using bulk photometry measurements. Single-cell measures back this claim but also show that it is only a modest minority of STN cells that respond strongly, with most showing no response during movement, and a similar number showing smaller inhibitions during movement.

      Similar data during cued active avoidance procedures show that STN calcium activity sharply increases in response to auditory cues, and during cued movements to avoid a footshock. Optogenetic and lesion experiments are consistent with an important role for STN in generating cue-evoked avoidance. And a strength of these results is that multiple bi-directional approaches were used.

      Weaknesses:

      I found the experimental design and presentation convoluted and the results over-interpreted.

      (1) I really don't understand or accept this idea that delayed movement is necessarily indicative of cautious movements. Is the distribution of responses multi-modal in a way that might support this idea, or do the authors simply take a normal distribution and assert that the slower responses represent 'caution'? Even if responses are multi-modal and clearly distinguished by 'type', why should readers think that delayed responses imply cautious responding instead of, say, habituation or sensitization to cue/shock; variability in attention, motivation, or stress; or merely uncertainty, which seems plausible given what I understand of the task design, where the same mice are repeatedly tested in changing conditions? This relates to a major claim (i.e., in the work's title).

      (2) Related to the last, I'm struggling to understand the rationale for dividing cells into 'types' based on their physiological responses in some experiments (e.g., Figure 7).

      (3) The description and discussion of orienting head movements were not well supported, but were much discussed in the avoidance datasets. The initial speed peaks to cue seem to be the supporting data upon which these claims rest, but nothing here suggests head movement or orientation responses.

      (4) Similar to the last, the authors note in several places, including abstract, the importance of STN in response timing, i.e., particularly when there must be careful or precise timing, but I don't think their data or task design provides a strong basis for this claim.

      (5) I think that other reports show that STN calcium activity is recruited by inescapable foot shock as well. What do these authors see? Is shock, independent of movement, contributing to sharp signals during escapes?

      (6) In particular, and related to the last point, the following work is very relevant and should be cited: https://elifesciences.org/reviewed-preprints/104643#tab-content. Note that the focus of this other paper is on a subset of VGLUT2+ Tac1 neurons in paraSTN, but using VGLUT2-Cre to target STN will target both STN and paraSTN.

      (7) In multiple other instances, claims that were more tangential to the main claims were made without clearly supporting data or statistics. E.g., claim that STN activation is related to translational more than rotational movement; claim that GCaMP and movement responses to auditory cues were small; claims that 'some animals' responded differently without showing individual data.

      (8) In several figures, the number of subjects used was not described. This is necessary. Also necessary is some assessment of the variability across subjects. The only measure of error shown in many figures relates to trial-to-trial or event variability, which is minimal because, in many cases, it appears that hundreds of trials may have been averaged per animal, but this doesn't provide a strong view of biological variability. When bar/line plots are used to display data, I recommend showing individual animals where feasible.
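
      The distinction between trial-level and animal-level variability can be made concrete with a short sketch. It assumes per-trial measurements grouped by animal; the data layout and function name are hypothetical and serve only to illustrate the recommendation.

```python
from math import sqrt
from statistics import mean, stdev

def across_animal_summary(trials_by_animal):
    """Collapse trials to one mean per animal, then report the grand mean
    and the standard error across animals (n = number of animals)."""
    animal_means = {animal: mean(values) for animal, values in trials_by_animal.items()}
    per_animal = list(animal_means.values())
    grand_mean = mean(per_animal)
    if len(per_animal) > 1:
        sem = stdev(per_animal) / sqrt(len(per_animal))
    else:
        sem = float("nan")  # variability across animals is undefined for n = 1
    return animal_means, grand_mean, sem

# Toy data: three animals with different numbers of trials
trials_by_animal = {
    "m1": [1.0, 3.0],
    "m2": [4.0, 4.0, 4.0],
    "m3": [5.0, 7.0],
}
animal_means, grand_mean, sem = across_animal_summary(trials_by_animal)
```

      Here the error term reflects the spread between animals and does not shrink when more trials are averaged per animal. The per-animal means are also the values that would be plotted as individual points.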

      (9) Can the authors consider the extent to which calcium imaging may be better suited to identify increases compared to decreases and how this may affect the results, particularly related to the GRIN data when similar numbers of cells show responses in both directions (e.g., Figure 3)?

      (10) Raw example traces are not provided.

      (11) The timeline of the spontaneous movement and avoidance sessions was not clear, nor was the number of events or sessions per animal nor how this was set. It is not clear if there was pre-training or habituation, if many or variable sessions were combined per animal, or what the time gaps between sessions were, or if or how any of these parameters might influence interpretation of the results.

      (12) It is not clear if or how the spread of expression outside of the target STN was evaluated, and if or how many mice were excluded due to spread or fiber placements.

    1. Reviewer #1 (Public review):

      Summary:

      Previous studies have shown that treatment with 17α-estradiol (a stereoisomer of the 17β-estradiol) extends lifespan in male mice but not in females. The current study by Li et al, aimed to identify cell-specific clusters and populations in the hypothalamus of aged male rats treated with 17α-estradiol (treated for 6 months). This study identifies genes and pathways affected by 17α-estradiol in the aged hypothalamus.

      Strengths:

      Using single-nucleus transcriptomic sequencing (snRNA-seq) on hypothalamus from aged male rats treated with 17α-estradiol they show that 17α-estradiol significantly attenuated age-related increases in cellular metabolism, stress, and decreased synaptic activity in neurons.

      Moreover, sc-analysis identified GnRH as one of the key mediators of 17α-estradiol's effects on energy homeostasis. Furthermore, they show that CRH neurons exhibited a senescent phenotype, suggesting a potential side effect of the 17α-estradiol. These conclusions are supported by supervised clustering by neuropeptides, hormones, and their receptors.

      Weaknesses:

      However, the study has several limitations that reduce the strength of the key claims in the manuscript. In particular:

      (1) The study focused only on males and did not include comparisons with females. However, previous studies have shown that 17α-estradiol extends lifespan in a sex-specific manner in mice, affecting males but not females. Without the comparison with the female data, it's difficult to assess its relevance to the lifespan.

      (2) It is not known whether 17α-estradiol leads to lifespan extension in male rats similar to male mice. Therefore, it is not possible to conclude that the observed effects in the hypothalamus are linked to the lifespan extension. The manuscript cited in the introduction does not include lifespan data on rats.

      (3) The effect of 17α-estradiol on non-neuronal cells such as microglia and astrocytes is not well described (Fig.1). Previous studies demonstrated that 17α-estradiol reduces microgliosis and astrogliosis in the hypothalamus of aged male mice. Current data suggest that the proportions of oligodendrocytes and microglia were increased by the drug treatment, while the proportion of astrocytes was decreased. These data might suggest possible species differences, differences in the treatment regimen, or differences in drug efficiency. This has to be discussed.

      A more detailed analysis of glial cell types within the hypothalamus in response to drug should be provided.

      (4) The conclusion that CRH neurons are going into senescence is not clearly supported by the data. A more detailed analysis of the hypothalamus such as histological examination to assess cellular senescence markers in CRH neurons, is needed to support this claim.

      Revised submission:

      Some of the concerns were addressed in this revised version, and the authors responded and addressed study design limitations in both sexes/ages.

      However, there are still some concerns that were not sufficiently addressed. While the term "senescent" was changed to "stressed," some histological/cellular validation of this phenotype is still needed.

      Some discussion on the sex-specific effects of 17α-estradiol in the hypothalamus is still required. Previous studies in mice demonstrated that 17α-estradiol reduced hypothalamic microgliosis and astrogliosis in male but not female UM-HET3 mice.

      Additionally, the provided analysis on astrocytes and microglia is superficial.

    2. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public Review):

      Summary:

      Previous studies have shown that treatment with 17α-estradiol (a stereoisomer of 17β-estradiol) extends lifespan in male mice but not in females. The current study by Li et al. aimed to identify cell-specific clusters and populations in the hypothalamus of aged male rats treated with 17α-estradiol (treated for 6 months). This study identifies genes and pathways affected by 17α-estradiol in the aged hypothalamus.

      Strengths:

      Using single-nucleus transcriptomic sequencing (snRNA-seq) on the hypothalamus from aged male rats treated with 17α-estradiol they show that 17α-estradiol significantly attenuated age-related increases in cellular metabolism, stress, and decreased synaptic activity in neurons.

      Thanks.

      Moreover, sc-analysis identified GnRH as one of the key mediators of 17α-estradiol's effects on energy homeostasis. Furthermore, they show that CRH neurons exhibited a senescent phenotype, suggesting a potential side effect of the 17α-estradiol. These conclusions are supported by supervised clustering by neuropeptides, hormones, and their receptors.

      Thanks.

      Weaknesses:

      However, the study has several limitations that reduce the strength of the key claims in the manuscript. In particular:

      (1) The study focused only on males and did not include comparisons with females. However, previous studies have shown that 17α-estradiol extends lifespan in a sex-specific manner in mice, affecting males but not females. Without the comparison with the female data, it's difficult to assess its relevance to the lifespan.

      This study was originally designed based on previous findings indicating that lifespan extension is only effective in males, leading to the exclusion of females from the analysis. The primary focus of our research was on the transcriptional changes and serum endocrine alterations induced by 17α-estradiol in aged males compared to untreated aged males. We believe that even in the absence of female subjects, the significant effects of 17α-estradiol on metabolism in the hypothalamus, synapses, and endocrine system remain evident, particularly regarding the expression levels of GnRH and testosterone. Notably, lower overall metabolism, increased synaptic activity, and elevated levels of GnRH and testosterone are strong indicators of health and well-being in males, supporting the validity of our primary conclusions. However, including female controls would enhance the depth of our findings. If female controls were incorporated, we propose redesigning the sample groups to include aged male control, aged female control, aged female treated, aged male treated, as well as young male control, young male treated, young female control, and young female treated. We regret that we cannot provide this data in the short term. Nevertheless, we believe this reviewer’s creative idea presents a valuable avenue for future research on this topic. In this study, we emphasize the role of 17α-estradiol in overall metabolism, synaptic function, GnRH, and testosterone in aged males and underscore the importance of supervised clustering of neuropeptide-secreting neurons in the hypothalamus.

      (2) It is not known whether 17α-estradiol leads to lifespan extension in male rats similar to male mice. Therefore, it is not possible to conclude that the observed effects in the hypothalamus are linked to the lifespan extension.

      Thank you for the reminder. 17α-estradiol was reported to extend lifespan in male rats similar to male mice (PMID: 33289482). We have added this valuable reference to the Introduction in the new version.

      (3) The effect of 17α-estradiol on non-neuronal cells such as microglia and astrocytes is not well-described (Figure 1). Previous studies demonstrated that 17α-estradiol reduces microgliosis and astrogliosis in the hypothalamus of aged male mice. Current data suggest that the proportions of oligodendrocytes and microglia were increased by the drug treatment, while the proportion of astrocytes was decreased. These data might suggest possible species differences, differences in the treatment regimen, or differences in drug efficiency. This has to be discussed.

      We have reviewed reports describing changes in cell numbers following 17α-estradiol treatment in the brain, using the keywords "17α-estradiol," "17alpha-estradiol," and "microglia" or "astrocyte." Only a limited amount of data was obtained. We found one article indicating that 17α-estradiol treatment in Tg (AβPP(swe)/PS1(ΔE9)) model mice resulted in a decreased microglial cell number compared to the placebo (AβPP(swe)/PS1(ΔE9) mice), but this change was not significant when compared to the non-transgenic control (PMID: 21157032). The transgenic AβPP(swe)/PS1(ΔE9) mouse model may differ from our wild-type aging rat model in this context.

      Moreover, the calculation of cell numbers was based on visual observation under a microscope across several brain tissue slices. This traditional method often yields controversial results. For example, oligodendrocytes in the corpus callosum, fornix, and spinal cord have been reported to be 20-40% more numerous in males than in females based on microscopic observations (PMID: 16452667). In contrast, another study found no significant difference in the number of oligodendrocytes between sexes when using immunohistochemistry staining (PMID: 18709647). Such discrepancies arising from traditional observational methods are inevitable.

      We believe the data presented in this article are reliable because the cell number and cell ratio data were derived from high-throughput cell counting of the entire hypothalamus using single-cell suspension and droplet wrapping (10x Genomics).

      (4) A more detailed analysis of glial cell types within the hypothalamus in response to drugs should be provided.

      We provided more enrichment analysis data of differentially expressed genes between Y, O, and O.T in microglia and astrocytes in Figure 2—figure supplement 3. In this supplemental data, we found that, unlike neurons, microglia (Micro) displayed lower levels of synapse-related cellular processes in O.T compared to O.

      (5) The conclusion that CRH neurons are going into senescence is not clearly supported by the data. A more detailed analysis of the hypothalamus such as histological examination to assess cellular senescence markers in CRH neurons, is needed to support this claim.

      We also noted the inappropriate claim and have changed "senescent phenotype" to "stressed phenotype" and "abnormal phenotype" in both the abstract and results sections. The stressed phenotype could be induced by heightened functional activity in the cells, potentially indicating higher cellular activity. The GnRH and CRH neurons discussed in this paper may represent such a case, as illustrated by the observed high serum GnRH, testosterone, and cortisol levels. This revision suggestion is highly valuable and constructive for our understanding of the unique physiological characteristics revealed by these data.

      Reviewer #2 (Public Review):

      Summary:

      Li et al. investigated the potential anti-ageing role of 17α-Estradiol on the hypothalamus of aged rats. To achieve this, they employed a very sophisticated method for single-cell genomic analysis that allowed them to analyze effects on various groups of neurons and non-neuronal cells. They were able to sub-categorize neurons according to their capacity to produce specific neurotransmitters, receptors, or hormones. They found that 17α-Estradiol treatment led to an improvement in several factors related to metabolism and synaptic transmission by bringing the expression levels of many of the genes of these pathways closer or to the same levels as those of young rats, reversing the ageing effect. Interestingly, among all neuronal groups, the proportion of Oxytocin-expressing neurons seems to be the one most significantly changing after treatment with 17α-Estradiol, suggesting an important role of these neurons in mediating its anti-ageing effects. This was also supported by an increase in circulating levels of oxytocin. It was also found that gene expression of corticotropin-releasing hormone neurons was significantly impacted by 17α-Estradiol even though it was not different between aged and young rats, suggesting that these neurons could be responsible for side effects related to this treatment. This article revealed some potential targets that should be further investigated in future studies regarding the role of 17α-Estradiol treatment in aged males.

      Strengths:

      (1) Single-nucleus mRNA sequencing is a very powerful method for gene expression analysis and clustering. The supervised clustering of neurons was very helpful in revealing otherwise invisible differences between neuronal groups and helped identify specific neuronal populations as targets.

      Thanks.

      (2) There is a variety of functions used that allow the differential analysis of a very complex type of data. This led to a better comparison between the different groups on many levels.

      Thanks.

      (3) There were some physiological parameters measured such as circulating hormone levels that helped the interpretation of the effects of the changes in hypothalamic gene expression.

      Thanks.

      Weaknesses:

      (1) One main control group is missing from the study, the young males treated with 17α-Estradiol.

      Given that the treatment period lasts six months, which extends beyond the young male rats' age range, we aimed to investigate the perturbation of 17α-Estradiol on the normal aging process. Including data from young males could potentially obscure the treatment's effects in aged males due to age effects, though similar effects between young and aged animals may exist. Long-term hormone treatment may exert more developmental effects on young animals than on old ones. Consequently, we decided to exclude this group from our initial sample design. We apologize for this omission.

      (2) Even though the technical approach is a sophisticated one, analyzing the whole rat hypothalamus instead of specific nuclei or subregions makes the study weaker.

      The precise targets of 17α-Estradiol within the hypothalamus remain unresolved. Selecting a specific nucleus for study is challenging. The supervised clustering method described in this manuscript allows us to identify the more sensitive neuron subtypes influenced by 17α-Estradiol and aging across the entire hypothalamus, without the need to isolate specific nuclei in a disturbed hypothalamic environment.

      (3) Although the authors claim to have several findings, the data fail to support these claims.

      Thanks. We take this comment to refer to the claim of a senescent phenotype in Crh neurons induced by 17α-estradiol. We have changed "senescent phenotype" to "stressed phenotype" in the abstract and results to avoid this claim. The stressed phenotype may be induced by heightened functional activity in the cells, potentially indicating higher cellular activity.

      (4) The study is about improving ageing but no physiological data from the study demonstrated such a claim with the exception of the testes histology which was not properly analyzed and was not even significantly different between the groups.

      The primary objective of this study is to elucidate the effects of 17α-Estradiol on the endocrine system in the aging hypothalamus; exploring anti-aging effects is not the main focus. Previous publications indicate that down-regulated GnRH and testosterone levels, along with elevated mTOR signaling, are indicators of aging in these organs (PMID: 37886966, PMID: 37048056, PMID: 22884327). The contrasting signaling networks related to metabolism and synaptic processes significantly differentiate young and aging hypothalami, and 17α-Estradiol helps rebalance these networks, suggesting its potential anti-aging effects.

      (5) Overall, the study remains descriptive with no physiological data to demonstrate that any of the effects on hypothalamic gene expression are related to metabolic, synaptic, or other functions.

      The study focuses on investigating cellular responses and endocrine changes in the aging hypothalamus induced by 17α-estradiol, utilizing single-nucleus RNA sequencing (snRNA-seq) and a novel data mining methodology to analyze various neuron subtypes. It is important to note that this study does not primarily aim to explore the anti-aging effects. Consequently, we have revised the claim in the abstract from “the effects of 17α-estradiol in anti-aging in neurons” to “the effects of 17α-estradiol on aging neurons.” We observed that the lower overall metabolism and increased expression levels of cellular processes in the synapses align with findings previously reported regarding 17α-estradiol. To address the lack of physiological data and the challenges in measuring multiple endocrine factors due to their volatile nature, we employed several bidirectional Mendelian randomization (MR) analyses of various genome-wide association study (GWAS) data related to these serum endocrine factors to identify their mutual causal effects.

      Reviewing Editor Comment:

      Based on the Public Reviews and Recommendations for Authors, the Reviewers strongly recommend that revisions include an experimental demonstration of the physiological effects of the treatment on ageing in rats as well as the CRH-senescence link. Additional analysis of the glia would greatly strengthen the study, as would inclusion of females and young male controls. The important point was also raised that the work linking 17a-estradiol was performed in mice, and the link with lifespan in rats is not known. Discussion of this point is recommended.

      We thank the reviewers for their constructive feedback. Regarding the recommendations in the Public Reviews and Recommendations for Authors:

      a) Physiological effects & CRH-senescence link:

      We acknowledge that 17α-estradiol has been reported to extend lifespan in male rats, consistent with findings in male mice (PMID: 33289482). This point has now been noted in the Introduction. We regret that further experimental validation of the treatment's physiological effects on aging in rats was beyond the scope of this study.

      b) Phenotype terminology:

      In response to concerns about the "senescent" characterization of CRH neurons, we have revised this terminology to "stressed phenotype" throughout the abstract and results. While we were unable to conduct additional experiments to confirm senescence markers, this revised description better reflects the heightened cellular activity observed (as evidenced by elevated serum GnRH and testosterone levels), without implying confirmed senescence.

      c) Glial cell analysis:

      To address questions about glial cell function during treatment, we have added new enrichment analysis data of differentially expressed genes in microglia and astrocytes from young (Y), old (O), and old treated (O.T) groups in Figure 2—figure supplement 3. This analysis reveals that microglia exhibit contrasting synaptic-related cellular processes compared to total neurons.

      d) Female and young controls:

      We sincerely apologize for the absence of female subjects and young male controls in the current study. The reviewers' suggestion to examine the male-specific effects of 17α-estradiol using female controls represents an excellent direction for future research, which we plan to pursue in upcoming studies.

      Reviewer #2 (Recommendations For The Authors):

      General comments:

      (1) The manuscript is very hard to read. Proofreading and editing by software or a professional seems necessary. The words "enhanced", "extensive" etc. are not always used in the right way.

      Thanks for the suggestion. We have proofread and edited the manuscript. The uses of "enhanced" and "extensive" were also revised in most sentences.

      (2) The numbers of animals and samples are not well explained. Is it 9 rats overall or per group? If there are 8 testes samples per group, should we assume that there were 4 rats per group? How was the pooling of the hypothalami done? Were all the hypothalami from each group pooled together? A small table with the animals per group and the samples would help.

      We appreciate your reminder regarding the initial mistake in our manuscript preparation. In the preliminary submission, we reported 9 rats based solely on sequencing data and data mining. The revised version (v1) now includes additional experimental data, with an effective total of 12 animals (4 per group). Unfortunately, we overlooked updating this information in the v1 submission. We have since added detailed information in the Materials and Methods sections: Animals, Treatment and Tissues, and snRNA-seq Data Processing, Batch Effect Correction, and Cell Subset Annotation.

      (3) The Clustering is wrong. There are genes in there that do not fall into any of the 3 categories: Neurotransmitters, Receptors, Hormones.

      We acknowledge the error in gene clustering and have implemented the following corrections:

      (a) The description has been updated to state: 'Vast majority of these subtypes were clustered by neuropeptides, hormones, and their receptors among all neurons.'

      (b) Genes not belonging to these three categories have been substantially removed.

      (c) The neuropeptide category (now including several growth hormones) has been expanded to 104 genes, while their corresponding receptors (including several sex hormone receptors) now comprise 105 genes.

      (4) The coloring of groups in the graphs is inconsistent. It must be more homogeneous to make it easier to identify.

      We have changed the colors of groups in Fig. 1D to make the color of cell clusters consistent in Fig. 1A-D.

      (5) The groups c1-c4 are not well explained. How did the authors come up with these?

      We have added more descriptions of c1-c4 to the Materials and Methods in the new version.

      (6) In most cases it's not clear if the authors are talking about cell numbers that express a certain mRNA, the level of expression of a certain mRNA, or both. They need to do a better job using more precise descriptions instead of using general terms such as "signatures", "expression profiles", "affected neurons" etc. It is very hard to understand if the number of neurons is compared between the groups or the gene expression.

      We have changed "signatures" to "gene signatures" to make the meaning more accurate. "Affected neurons" was also changed to "sensitive neurons". We apologize that we were unable to find a better alternative to "expression profiles".

      (7) Sometimes there are claims made without justification or a reference. For example, the claim about the senescence of CRH neurons due to the upregulation of mitochondrial genes and downregulation of adherence junction genes (lines 326-328) should be supported by a reference or own findings.

      The term "senescence" is not appropriate here. We have changed it to "stressed phenotype" or "aberrant changes" in the abstract and results.

      (8) Young males treated with Estradiol as a control group is necessary and it is missing.

      Your suggestion is appreciated; however, the treatment duration for aged rats (O.T) was set at 6 months, while the young rats were only 4 months old. This disparity makes it challenging to align treatment timelines for the young animals. The primary aim of this study is to investigate the perturbation of 17α-estradiol on the aging process, and any distinct age-related effects observed in young males might complicate our understanding of its role in aged males, though similar endocrine effects may exist in the young animals. Long-term hormone treatment may exert more developmental effects on young animals than on old ones. Therefore, we made the decision to exclude the young samples in our initial study design. We apologize for any confusion this may have caused.

      Specific Comments:

      Line 28: "elevated stresses and decreased synaptic activity": Please make this clearer. Can't claim changes in synaptic activity by gene expression.

      We have changed it to "the expression level of pathways involved in synapse".

      Line 32: "increased Oxytocin": serum Oxytocin.

      We have added the “serum”.

      Line 52 - 54: Any studies from rats?

      Thanks. It has also been reported in rats that 17α-estradiol has metabolic roles similar to those in mice (PMID: 33289482), and we have added this to the references. It is very useful for this manuscript.

      Line 62 - 65: It wasn't investigated thoroughly in this paper so why was it suggested in the introduction?

      We have deleted this sentence as suggested.

      Line 70: "synaptic activity" Same as line 28.

      We have changed it to "pathways involved in synaptic activity".

      Line 79: Why were aged rats caged alone and young by two? Could that introduce hypothalamic gene expression effects?

      The young males could be housed together peacefully, but aged males fight and must be housed alone.

      Lines 78, 99, 109-110: It is not clear how many animals per group were used and how many samples per group were used separately and/or grouped. Please be more specific.

      We have added this information to Materials and methods/Animals, treatment and tissues and Materials and methods/snRNA-seq data processing, batch effect correction, and cell subset annotation.

      Line 205: "in O" please add "versus young.".

      We have changed accordingly.

      Line 207: replace "were" with "was"

      Instead, we have changed "proportion" to "proportions".

      Line 208: replace "that" with "compared to" and after "in O.T." add "compared to?"

      We have changed accordingly.

      Line 223: "O.T." compared to what? Figure?

      We have changed it accordingly.

      Line 227: Figure?

      We have added (Figure 1E) accordingly.

      Line 229: "synaptic activity" Same as line 28.

      We have revised it.

      Line 235: "synaptic activity" and "neuropeptide secretion" Same as line 28.

      We have revised it.

      Line 256: "interfered" please revise.

      We changed it to "exerted".

      Line 263: "on the contrary" please revise.

      We have changed "on the contrary" to "opposite".

      Line 270: "conversed" did you mean "conserved"?

      We have changed "conversed" to "inversed".

      Line 296-298: Please explain. Why would these be side effects?

      This is hard to explain; therefore, we deleted the words "side effects".

      Line 308: "synaptic activity" Same as line 28.

      We have changed it to "expression levels of synapse-related cellular processes".

      Line 314: "and sex hormone secretion and signaling" Isn't this expected?

      Yes, it is expected. We have revised the sentence to read "and, as expected, sex hormone secretion and signaling".

      Line 325-328: Why is this senescence? Reference?

      We have added “potent” to it.

      Line 360-361: This doesn't show elevated synaptic activity.

      "elevated synaptic activity" was changed to "The elevated expression of synapse-related pathways".

      Line 363-364: "Unfortunately" is not a scientific expression and shows bias.

      We have changed it to "Notably".

      Line 376: Similar as above.

      Yes, we have changed it to "in contrast".

      Lines 382-385: This is speculation. Please move to discussion.

      We apologize, but we consider the causal effects derived from the MR results to be evidence. As such, we have not changed it.

      Line 389: Please revise "hormone expressing".

      We have changed it accordingly.

      Line 401: Isn't this effect expected due to feedback inhibition of the biochemical pathway? Please comment.

      The binding capability of 17alpha-estradiol to estrogen receptors and its role in transcriptional activation remain core questions surrounded by controversy. Earlier studies suggest that 17alpha-estradiol exhibits at least 200 times less activity than 17beta-estradiol (PMID: 2249627, PMID: 16024755). However, recent data indicate that 17alpha-estradiol shows comparable genomic binding and transcriptional activation through estrogen receptor α (Esr1) to that of 17beta-estradiol (PMID: 33289482). Additionally, there is evidence that 17alpha-estradiol has anti-estrogenic effects in rats (PMID: 16042770). These findings imply possible feedback inhibition via estrogen receptors. Furthermore, 17alpha-estradiol likely differs from 17beta-estradiol due to its unique metabolic consequences and its potential to slow aging in males, an effect not attributed to 17beta-estradiol. For instance, neurons are also targets of 17alpha-estradiol, with Esr1 not being the sole target (PMID: 38776045). Intriguingly, neurons expressing Ar and Esr1 ranked among the top 20 most perturbed receptor subtypes during aging (O vs Y), but were no longer ranked in this group following treatment (O.T vs Y and O.T vs O comparisons). This indicates that 17α-estradiol administration attenuated age-associated perturbation in these neuronal subtypes, which may be a consequence of potential feedback (Figure 3D). Nevertheless, the precise effective targets of 17alpha-estradiol are still unresolved.

      Line 409: This conclusion cannot be made because the effect is not statistically significant. Can say "trend" etc.

      Thanks for the recommendation. We have added "potential" in front of the conclusion.

      Line 426: "suggesting" please revise.

      Sorry, it is a verb.

      Lines 426-428: This is speculation. Please move to discussion.

      The elevated GnRH levels in O.T., observed through EIA analysis, suggest a deduction regarding the direct causal effects of 17alpha-estradiol on various endocrine factors related to feeding, energy homeostasis, reproduction, osmotic regulation, stress response, and neuronal plasticity through MR analysis. Thus, we have not amended our position. We apologize for any confusion.

      Lines 431-432: improved compared to what?

      The statement has been revised to "The most striking role of 17α-estradiol treatment revealed in this study showed that HPG axis was substantially improved in the levels of serum Gnrh and testosterone".

      Line 435: " Estrogen Receptor Antagonists". Please revise.

      Thanks for the recommendation. We have changed it to "estrogen receptor antagonists".

      Line 438: "Secrete". Please revise.

      Sorry, it is "secret".

      Lines 439-449: None of this has been demonstrated. Please remove these conclusions.

      We appreciate the reviewer's scrutiny regarding lines 439-449. While these statements should not be interpreted as definitive conclusions from our current data, we propose they serve as clinically relevant discussion points worthy of exploration. Our findings demonstrate 17α-estradiol's role in modulating testosterone levels in aged males. This mechanistic insight warrants consideration of its therapeutic potential for age-related hypogonadism - a hypothesis we believe merits discussion given the compound's specific endocrine effects.

      Lines 450-457: No females were included in this study. Why? Also, why is this discussed? It is relevant but doesn't belong in this manuscript since it was not studied here.

      Testosterone levels are crucial for male health, while estradiol levels are essential for the health and fertility of females. Previous studies have demonstrated that 17α-estradiol does not contribute to lifespan extension in females. Given the effects of 17α-estradiol on males—specifically, its role in promoting testosterone and reducing estradiol levels—we believe it is important to discuss the potential sex-biased effects of 17α-estradiol, as this could inform future investigations. We have refined this section to clarify that these points represent mechanistic hypotheses derived from our male data and existing literature, not conclusions about unstudied female physiology. This framing maintains the discussion's scientific value while respecting the study's scope.

      Lines 458-459: This was not demonstrated in this article. Please remove.

      We have restricted the claim to "expression level of energy metabolism in hypothalamic neurons".

      Line 464: "Promoted lifespan extension" Not demonstrated. Please remove.

      The end of the sentence was revised to "which may be a contributing factor in promoting lifespan extension".

      Line 466: "Showed" No.

      The whole sentence was deleted in the new version.

      Line 483: "the sex-based effects". Not studied here.

      Since the changes in testosterone levels are significant in this dataset and this hormone has a sex-biased nature, we find it worthwhile to suggest this as a topic for future investigation. We have added "which needs further verification in the future" at the end of this sentence.

    1. Reviewer #1 (Public review):

      Summary:

      The authors introduce a densely-sampled dataset where 6 participants viewed images and sentence descriptions derived from the MS Coco database over the course of 10 scanning sessions. The authors further showcase how image and sentence decoders can be used to predict which images or descriptions were seen, using pairwise decoding across a set of 120 test images. The authors find decodable information widely distributed across the brain, with a left-lateralized focus. The results further showed that modality-agnostic models generally outperformed modality-specific models, and that data based on captions was not explained better by caption-based models but by modality-agnostic models. Finally, the authors decoded imagined scenes.
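      For readers unfamiliar with pairwise decoding, the following is a minimal sketch of how such an accuracy is commonly computed. It is not taken from the manuscript: the function names, the use of cosine similarity, and the toy vectors are assumptions made for illustration only. For every pair of test stimuli, the pair counts as correctly decoded when the two predicted feature vectors are jointly more similar to their own targets than to each other's targets.

      ```python
      import math
      from itertools import combinations


      def cosine(u, v):
          """Cosine similarity between two non-zero vectors (assumed metric)."""
          dot = sum(a * b for a, b in zip(u, v))
          norm_u = math.sqrt(sum(a * a for a in u))
          norm_v = math.sqrt(sum(b * b for b in v))
          return dot / (norm_u * norm_v)


      def pairwise_accuracy(predicted, targets):
          """Fraction of stimulus pairs whose matched assignment beats the swapped one.

          predicted[i] is the decoder output for stimulus i, and targets[i] is
          the reference feature vector for the same stimulus (hypothetical names).
          """
          pairs = list(combinations(range(len(targets)), 2))
          correct = 0
          for i, j in pairs:
              matched = cosine(predicted[i], targets[i]) + cosine(predicted[j], targets[j])
              swapped = cosine(predicted[i], targets[j]) + cosine(predicted[j], targets[i])
              if matched > swapped:
                  correct += 1
          return correct / len(pairs)


      # Toy data: three orthogonal target vectors.
      targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
      perfect = pairwise_accuracy(targets, targets)
      # Predictions for stimuli 0 and 1 are swapped, so only that pair fails.
      confused = pairwise_accuracy([targets[1], targets[0], targets[2]], targets)
      ```

      Under this scheme chance level is 50% regardless of the number of test stimuli, which is one reason an n-way classification can be a stricter test than a pairwise comparison.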

      Strengths:

      (1) The dataset presents a potentially very valuable resource for investigating visual and semantic representations and their interplay.

      (2) The introduction and discussion are very well written in the context of trying to understand the nature of multimodal representations and present a comprehensive and very useful review of the current literature on the topic.

      Weaknesses:

      (1) The paper is framed as presenting a dataset, yet most of it revolves around the presentation of findings in relation to what the authors call modality-agnostic representations, and in part around mental imagery. This makes it very difficult to assess the manuscript, whether the authors have achieved their aims, and whether the results support the conclusions.

      (2) While the authors have presented a potential use case for such a dataset, there is currently far too little detail regarding data quality metrics expected from the introduction of similar datasets, including the absence of head-motion estimates, quality of intersession alignment, or noise ceilings of all individuals.

      (3) The exact methods and statistical analyses used are still opaque, making it hard for a reader to understand how the authors achieved their results. More detail in the manuscript would be helpful, specifically regarding the exact statistical procedures, what tests were performed across, or how data were pooled across participants.

      (4) Many findings (e.g., Figure 6) are still qualitative but could be supported by quantitative measures.

      (5) Results are significant in regions that typically lack responses to visual stimuli, indicating potential bias in the classifier. This is relevant for the interpretation of the findings. A classification approach less sensitive to outliers (e.g., 70-way classification) could avoid this issue. Given the extreme collinearity of the experimental design, regressors in close temporal proximity will be highly similar, which could lead to leakage effects.

      (6) The manuscript currently lacks a limitations section, specifically regarding the design of the experiment. This involves the use of the overly homogeneous COCO dataset, which invites overfitting; the mixing of sentence descriptions and visual images, which invites imagery of previously seen content; and the use of a 1-back task, which can lead to carry-over effects on the subsequent trial.

      (7) I would urge the authors to clarify whether the primary aim is the introduction of a dataset and showing the use of it, or whether it is the set of results presented. This includes the title of this manuscript. While the decoding approach is very interesting and potentially very valuable, I believe that the results in the current form are rather descriptive, and I'm wondering what specifically they add beyond what is known from other related work. This includes imagery-related results. This is completely fine! It just highlights that a stronger framing as a dataset is probably advantageous for improving the significance of this work.

    2. Reviewer #2 (Public review):

      Summary:

      This study introduces SemReps-8K, a large multimodal fMRI dataset collected while subjects viewed natural images and matched captions, and performed mental imagery based on textual cues. The authors aim to train modality-agnostic decoders (models that can predict neural representations independently of the input modality) and use these models to identify brain regions containing modality-agnostic information. They find that such decoders perform comparably to or better than modality-specific decoders and generalize to imagery trials.

      Strengths:

      (1) The dataset is a substantial and well-controlled contribution, with >8,000 image-caption trials per subject and careful matching of stimuli across modalities, making it an essential resource for testing theories of abstract and amodal representation.

      (2) The authors systematically compare unimodal, multimodal, and cross-modal decoders using a wide range of deep learning models, demonstrating thoughtful experimental design and thorough benchmarking.

      (3) Their decoding pipeline is rigorous, with informative performance metrics and whole-brain searchlight analyses, offering valuable insights into the cortical distribution of shared representations.

      (4) Extension to mental imagery decoding is a strong addition, aligning with theoretical predictions about the overlap between perception and imagery.

      Weaknesses:

      While the decoding results are robust, several critical limitations prevent the current findings from conclusively demonstrating truly modality-agnostic representations:

      (1) Shared decoding ≠ abstraction: Successful decoding across modalities does not necessarily imply abstraction or modality-agnostic coding. Participants may engage in modality-specific processes (e.g., visual imagery when reading, inner speech when viewing images) that produce overlapping neural patterns. The analyses do not clearly disambiguate shared representational structure from genuinely modality-independent representations. Furthermore, in Figure 5, the modality-agnostic decoder did not perform better than the modality-specific decoder trained on images (in decoding images), but outperformed the modality-specific decoder trained on captions (in decoding captions). This asymmetry contradicts the premise of a truly "modality-agnostic" decoder. Additionally, given the similar performance between modality-agnostic decoders based on multimodal versus unimodal features, it remains unclear why neural representations did not preferentially align with multimodal features if they were truly modality-independent.

      (2) The current analysis cannot definitively conclude that the decoder itself is modality-agnostic, making "Qualitative Decoding Results" difficult to interpret in this context. This section currently provides illustrative examples, but lacks systematic quantitative analyses.

      (3) The use of mental imagery as evidence for modality-agnostic decoding is problematic. Imagery involves subjective, variable experiences and likely draws on semantic and perceptual networks in flexible ways. Strong decoding in imagery trials could reflect semantic overlap or task strategies rather than evidence of abstraction.

      The manuscript presents a methodologically sophisticated and timely investigation into shared neural representations across modalities. However, the current evidence does not clearly distinguish between shared semantics, overlapping unimodal processes, and true modality-independent representations. A more cautious interpretation is warranted. Nonetheless, the dataset and methodological framework represent a valuable resource for the field.

    3. Reviewer #3 (Public review):

      Summary:

      The authors recorded brain responses while participants viewed images and captions. The images and captions were taken from the COCO dataset, so each image has a corresponding caption, and each caption has a corresponding image. This enabled the authors to extract features from either the presented stimulus or the corresponding stimulus in the other modality. The authors trained linear decoders to take brain responses and predict stimulus features. "Modality-specific" decoders were trained on brain responses to either images or captions, while "modality-agnostic" decoders were trained on brain responses to both stimulus modalities. The decoders were evaluated on brain responses while the participants viewed and imagined new stimuli, and prediction performance was quantified using pairwise accuracy. The authors reported the following results:

      (1) Decoders trained on brain responses to both images and captions can predict new brain responses to either modality.

      (2) Decoders trained on brain responses to both images and captions outperform decoders trained on brain responses to a single modality.

      (3) Many cortical regions represent the same concepts in vision and language.

      (4) Decoders trained on brain responses to both images and captions can decode brain responses to imagined scenes.
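
      The decoding setup summarized above (a linear decoder from brain responses to stimulus features, scored by pairwise accuracy) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the ridge penalty, the use of cosine similarity, and the exact pairwise-accuracy definition (own target versus one distractor, chance level 0.5) are assumptions, and the helper names `fit_ridge` and `pairwise_accuracy` are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge weights mapping brain responses X (trials x voxels)
    to stimulus features Y (trials x feature dims). alpha is an assumed penalty."""
    n_voxels = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_voxels), X.T @ Y)

def pairwise_accuracy(pred, target):
    """Fraction of ordered pairs (i, j), i != j, for which prediction i is more
    similar (cosine) to its own target i than to the distractor target j."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    sim = pred @ target.T                      # sim[i, j] = cos(pred_i, target_j)
    correct = sim.diagonal()[:, None] > sim    # own target beats distractor j
    off_diagonal = ~np.eye(len(sim), dtype=bool)
    return correct[off_diagonal].mean()

# Synthetic stand-in data (not from the study): noiseless linear mapping.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
Y = X @ rng.normal(size=(20, 5))
W = fit_ridge(X[:150], Y[:150], alpha=1e-6)
score = pairwise_accuracy(X[150:] @ W, Y[150:])
```

      Under this definition, a "modality-agnostic" decoder would simply be one whose training rows in `X` pool responses to both images and captions, while the feature matrix `Y` can come from a visual, linguistic, or multimodal model.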

      Strengths:

      This is an interesting study that addresses important questions about modality-agnostic representations. Previous work has shown that decoders trained on brain responses to one modality can be used to decode brain responses to another modality. The authors build on these findings by collecting a new multimodal dataset and training decoders on brain responses to both modalities.

      To my knowledge, SemReps-8K is the first dataset of brain responses to vision and language where each stimulus item has a corresponding stimulus item in the other modality. This means that brain responses to a stimulus item can be modeled using visual features of the image, linguistic features of the caption, or multimodal features derived from both the image and the caption. The authors also employed a multimodal one-back matching task, which forces the participants to activate modality-agnostic representations. Overall, SemReps-8K is a valuable resource that will help researchers answer more questions about modality-agnostic representations.

      The analyses are also very comprehensive. The authors trained decoders on brain responses to images, captions, and both modalities, and they tested the decoders on brain responses to images, captions, and imagined scenes. They extracted stimulus features using a range of visual, linguistic, and multimodal models. The modeling framework appears rigorous, and the results offer new insights into the relationship between vision, language, and imagery. In particular, the authors found that decoders trained on brain responses to both images and captions were more effective at decoding brain responses to imagined scenes than decoders trained on brain responses to either modality in isolation. The authors also found that imagined scenes can be decoded from a broad network of cortical regions.

      Weaknesses:

      The characterization of "modality-agnostic" and "modality-specific" decoders seems a bit contradictory. There are three major choices when fitting a decoder: the modality of the training stimuli, the modality of the testing stimuli, and the model used to extract stimulus features. However, the authors characterize their decoders based on only the first choice: "modality-specific" decoders were trained on brain responses to either images or captions, while "modality-agnostic" decoders were trained on brain responses to both stimulus modalities. I think that this leads to some instances where the conclusions are inconsistent with the methods and results.

      First, the authors suggest that "modality-specific decoders are not explicitly encouraged to pick up on modality-agnostic features during training" (line 137) while "modality-agnostic decoders may be more likely to leverage representations that are modality-agnostic" (line 140). However, whether a decoder is required to learn modality-agnostic representations depends on both the training responses and the stimulus features. Consider the case where the stimuli are represented using linguistic features of the captions. When you train a "modality-specific" decoder on image responses, the decoder is forced to rely on modality-agnostic information that is shared between the image responses and the caption features. On the other hand, when you train a "modality-agnostic" decoder on both image responses and caption responses, the decoder has access to the modality-specific information that is shared by the caption responses and the caption features, so it is not explicitly required to learn modality-agnostic features. As a result, while the authors show that "modality-agnostic" decoders outperform "modality-specific" decoders in most conditions, I am not convinced that this is because they are forced to learn more modality-agnostic features.

      Second, the authors claim that "modality-specific decoders can be applied only in the modality that they were trained on, while modality-agnostic decoders can be applied to decode stimuli from multiple modalities, even without knowing a priori the modality the stimulus was presented in" (line 47). While "modality-agnostic" decoders do outperform "modality-specific" decoders in the cross-modality conditions, it is important to note that "modality-specific" decoders still perform better than expected by chance (Figure 5). It is also important to note that knowing about the input modality still improves decoding performance even for "modality-agnostic" decoders, since it determines the optimal feature space: it is better to decode brain responses to images using decoders trained on image features, and it is better to decode brain responses to captions using decoders trained on caption features.

    1. Reviewer #2 (Public review):

      Summary:

      This study develops a joint epidemiological and population genetic model to infer variant-specific effective reproduction numbers Rt and growth advantages of SARS-CoV-2 variants using US case counts and sequence data (Jan 2021-Mar 2022). For this, they use the commonly used renewal equation framework, observation models (negative binomial with zero inflation and Dirichlet-multinomial likelihoods, both to account for overdispersion). For the parameterization of Rt, again, they used a classic cubic spline basis expansion. Additionally, they use Bayesian Inference, specifically SVI. I was reassured to see the sensitivity analysis on the generation time to check effects on Rt.
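
      For readers unfamiliar with the renewal equation framework mentioned above, the core recursion can be sketched as below. This is a deterministic toy version under stated assumptions, not the authors' model: it omits the observation models, the spline parameterization of Rt, and inference, and the generation-time weights and Rt values are made up for illustration.

```python
import numpy as np

def simulate_renewal(rt, gen_time, seed_infections):
    """Deterministic renewal equation: I[t] = R[t] * sum_s g[s] * I[t - s],
    where gen_time[0] is the weight for a lag of one day."""
    n_days, n_lags = len(rt), len(gen_time)
    infections = np.zeros(n_days)
    infections[:n_lags] = seed_infections           # seeding period, held fixed
    for t in range(n_lags, n_days):
        recent = infections[t - n_lags:t][::-1]     # I[t-1], I[t-2], ..., I[t-n_lags]
        infections[t] = rt[t] * np.dot(gen_time, recent)
    return infections

# Hypothetical inputs: a shared generation-time distribution and two variants
# with constant reproduction numbers 1.0 and 1.3.
gen_time = np.array([0.2, 0.5, 0.3])
inf_a = simulate_renewal(np.full(30, 1.0), gen_time, seed_infections=10.0)
inf_b = simulate_renewal(np.full(30, 1.3), gen_time, seed_infections=10.0)
freq_b = inf_b / (inf_a + inf_b)                    # variant frequency over time
```

      In a joint model of this kind, the summed infections would be linked to case counts and the variant frequencies to sequence counts, which is what lets case and sequence data jointly constrain absolute and relative fitness.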

      This is an incredibly robust study design. Integrating case and sequence data enables estimation of both absolute and relative variant fitness, overcoming limitations of frequency-only or case-only models. This reminds me of https://www.medrxiv.org/content/10.1101/2023.01.02.23284123v4.full

      I also really appreciated the flexible and interpretable parameterization of the renewal equations with splines. But I may be biased since I really like splines!

      The approach is justified; however, it has some notable weaknesses, which I detail below.

      (1) The model does not account for demographic stochasticity or transmission overdispersion (superspreading), which are known to affect SARS-CoV-2 dynamics and can bias Rt, especially in low incidence or early introduction phases.

      (2) While the authors explore the sensitivity of generation time, the reliance on fixed generation time parameters (with some adjustments for Delta/Omicron) may still bias results.

      (3) There is no explicit adjustment for population immunity, which limits the ability to disentangle intrinsic variant fitness (even though the model allows for inclusion of covariates). To me, this is one of two major flaws in the study.

      (4) The second major flaw in my opinion is that there is no hierarchical pooling across states - each state is modeled independently. A hierarchical Bayesian model could borrow strength across states, improving estimates for states with sparse data and enabling more robust inference of shared variant effects.
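
      The idea of "borrowing strength" across states can be illustrated with the simplest partial-pooling calculation. The sketch below assumes a normal-normal model with a known between-state standard deviation `tau`; a full hierarchical Bayesian model would estimate `tau` and embed the pooling inside the renewal model, so this is only a conceptual illustration with a hypothetical helper name.

```python
import numpy as np

def partial_pool(estimates, std_errors, tau):
    """Shrink per-state estimates toward a shared mean (normal-normal model).
    tau is the assumed between-state standard deviation."""
    estimates = np.asarray(estimates, dtype=float)
    var = np.asarray(std_errors, dtype=float) ** 2
    weights = 1.0 / (var + tau ** 2)
    shared_mean = np.sum(weights * estimates) / np.sum(weights)
    shrink = var / (var + tau ** 2)    # 0 keeps the state estimate, 1 uses the shared mean
    pooled = (1.0 - shrink) * estimates + shrink * shared_mean
    return shared_mean, pooled

# Made-up growth advantages for three states; the third has sparse data
# (large standard error) and is therefore pulled hardest toward the mean.
shared_mean, pooled = partial_pool([0.30, 0.40, 0.90], [0.05, 0.05, 0.40], tau=0.10)
```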

      I would strongly recommend the following things in order of priority, where the first two points I consider critical.

      (1) Implement a hierarchical model for variant growth advantages and Rt across states.

      (2) Include time-varying covariates for vaccination rates, prior infection, and non-pharmaceutical interventions directly. This would help disentangle intrinsic variant transmissibility from changes in population susceptibility and behavior.

      (3) Extend the renewal model to a stochastic or branching process framework that explicitly models overdispersed transmission.

      (4) It would be good to allow for multiple seeding events per variant and per state. This can be informed by phylogeography in a minimum effort way and would improve the accuracy of Rt.

      (5) Addressing sampling bias is by now standard. I would recommend reweighting the sequence data or comparing results with independent surveillance data to assess the impact of non-representative sequencing.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      This is a well-designed and very interesting study examining the impact of imprecise feedback on outcomes in decision-making. I think this is an important addition to the literature, and the results here, which provide a computational account of several decision-making biases, are insightful and interesting.

      We thank the reviewer for highlighting the strengths of this work.

      I do not believe I have substantive concerns related to the actual results presented; my concerns are more related to the framing of some of the work. My main concern is regarding the assertion that the results prove that non-normative and non-Bayesian learning is taking place. I agree with the authors that their results demonstrate that people will make decisions in ways that demonstrate deviations from what would be optimal for maximizing reward in their task under a strict application of Bayes' rule. I also agree that they have built reinforcement learning models that do a good job of accounting for the observed behavior. However, the Bayesian models included are rather simple, per the authors' descriptions, applications of Bayes' rule with either fixed or learned credibility for the feedback agents. In contrast, several versions of the RL models are used, each modified to account for different possible biases. However, more complex Bayes-based models exist, notably active inference, but even the hierarchical Gaussian filter. These formalisms are able to accommodate more complex behavior, such as affect and habits, which might make them more competitive with RL models. I think it is entirely fair to say that these results demonstrate deviations from an idealized and strict Bayesian context; however, the equivalence here of Bayesian and normative is, I think, misleading or at least requires better justification/explanation. This is because a great deal of work has been done to show that Bayes optimal models can generate behavior or other outcomes that are clearly not optimal to an observer within a given context (consider hallucinations for example), but which make sense in the context of how the model is constructed as well as the priors and desired states the model is given.

      As such, I would recommend that the language be adjusted to carefully define what is meant by normative and Bayesian and to recognize that work that is clearly Bayesian could potentially still be competitive with RL models if implemented to model this task. An even better approach would be to directly use one of these more complex modelling approaches, such as active inference, as the comparator to the RL models, though I would understand if the authors would want this to be a subject for future work.

      We thank the reviewer for raising this crucial and insightful point regarding the framing of our results and the definitions of 'normative' and 'Bayesian' learning. Our primary aim in this work was to characterize specific behavioral signatures that demonstrate deviations from predictions generated by a strict, idealized Bayesian framework when learning from disinformation (which we term “biases”). We deliberately employed relatively simple Bayesian models as benchmarks to highlight these specific biases. We fully agree that more sophisticated Bayes-based models (as mentioned by the reviewer, or others) could potentially offer alternative mechanistic explanations for participant behavior. However, we currently do not have a strong notion about which Bayesian models can encompass our findings, and hence, we leave this important question for future work.

      To enhance clarity within the current manuscript, we now avoid the term “normative” when referring to our Bayesian models and use the term “ideal” instead. We also define more clearly what exactly we mean by that notion where the ideal model is described:

      “This model is based on an idealized assumptions that during the feedback stage of each trial, the value of the chosen bandit is updated (based on feedback valence and credibility) according to Bayes rule reflecting perfect adherence to the instructed task structure (i.e., how true outcomes and feedback are generated).”
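
      As a worked illustration of the kind of update this passage describes, the sketch below computes the posterior probability that a reward was truly earned, given the feedback and the source's credibility. It assumes that a source reports the true outcome with probability `credibility` and the opposite outcome otherwise; the function name is hypothetical, and the manuscript's ideal model may differ in detail (for example, in how beliefs about bandit values are carried across trials).

```python
def posterior_reward(prior, feedback_is_reward, credibility):
    """P(true outcome = reward | feedback), by Bayes' rule.

    Assumes the feedback agent reports the true outcome with probability
    `credibility` and reports the opposite outcome otherwise."""
    if feedback_is_reward:
        like_reward, like_no_reward = credibility, 1.0 - credibility
    else:
        like_reward, like_no_reward = 1.0 - credibility, credibility
    numerator = like_reward * prior
    return numerator / (numerator + like_no_reward * (1.0 - prior))
```

      Two limiting cases show the intended behavior: a source with credibility 0.5 is uninformative, so the posterior equals the prior, whereas a fully credible source makes the posterior match the reported outcome.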

      Moreover, we have added a few sentences in the discussion commenting on how more complex Bayesian models might account for our empirical findings:

      “However, as hypothesized, when facing potential disinformation, we also find that individuals exhibit several important biases i.e., deviations from strictly idealized Bayesian strategies. Future studies should explore if and under what assumptions, about the task’s generative structure and/or learner’s priors and objectives, more complex Bayesian models (e.g., active inference (58)) might account for our empirical findings.”

      Abstract:

      The abstract is lacking in some detail about the experiments done, but this may be a limitation of the required word count. If word count is not an issue, I would recommend adding details of the experiments done and the results.

      We thank the reviewer for their valuable suggestion. We have now included more details about the experiment in the abstract:

      “In two experiments, participants completed a two-armed bandit task, where they repeatedly chose between two lotteries and received outcome-feedback from sources of varying credibility, who occasionally disseminated disinformation by lying about true choice outcome (e.g., reporting non reward when a reward was truly earned or vice versa).”

      One comment is that there is an appeal to normative learning patterns, but this suggests that learning patterns have a fixed optimal nature, which may not be true in cases where the purpose of the learning (e.g. to confirm the feeling of safety of being in an in-group) may not be about learning accurately to maximize reward. This can be accommodated in a Bayesian framework by modelling priors and desired outcomes. As such, the central premise that biased learning is inherently non-normative or non-Bayesian, I think, would require more justification. This is true in the introduction as well.

      Introduction:

      As noted above, the conceptualization of Bayesian learning being equivalent to normative learning, I think requires further justification. Bayesian belief updating can be biased and non-optimal from an observer perspective, while being optimal within the agent doing the updating if the priors/desired outcomes are set up to advantage these "non-optimal" modes of decision making.

      We appreciate the reviewer's thoughtful comment regarding the conceptualization of "normative" and "Bayesian" learning. We fully agree that the definition of "normative" is nuanced and can indeed depend on whether one considers reward-maximization or the underlying principles of belief updating. As explained above we now restrict our presentation to deviations from “ideal Bayes” learning patterns and we acknowledge the reviewer’s concern in a caveat in our discussion.

      Results:

      I wonder why the agent was presented before the choice, since the agent is only relevant to the feedback after the choice is made. I wonder if that might have induced any false association between the agent identity and the choice itself. This is by no means a critical point, but it would be interesting to get the authors' thoughts.

      We thank the reviewer for raising this interesting point regarding the presentation of the agent before the choice. Our decision to present the agent at this stage was intentional, as our original experimental design aimed to explore the possible effects of "expected source credibility" on participants' choices (e.g., whether knowledge of feedback credibility will affect choice speed and accuracy). However, we found nothing that would be interesting to report.

      The finding that positive feedback increases learning is one that has been shown before and depends on valence, as the authors note. They expanded their reinforcement learning model to include valence, but they did not modify the Bayesian model in a similar manner. This lack of a valence or recency effect might also explain the failure of the Bayesian models in the preceding section, where the contrast effect is discussed. It is not unreasonable to imagine that if humans do employ Bayesian reasoning that this reasoning system has had parameters tuned based on the real world, where recency of information does matter; affect has also been shown to be incorporable into Bayesian information processing (see the work by Hesp on affective charge and the large body of work by Ryan Smith). It may be that the Bayesian models chosen here require further complexity to capture the situation, just like some of the biases required updates to the RL models. This complexity, rather than being arbitrary, may be well justified by decision-making in the real world.

      Thanks for these additional important ideas which speak more to the notion that more complex Bayesian frameworks may account for biases we report.

      The methods mention several symptom scales- it would be interesting to have the results of these and any interesting correlations noted. It is possible that some of the individual variability here could be related to these symptoms, which could introduce precision parameter changes in a Bayesian context and things like reward sensitivity changes in an RL context.

      We included these questionnaires for exploratory purposes, with the aim of generating informed hypotheses for future research into individual differences in learning. Given the preliminary nature of these analyses, we believe further research is required about this important topic.

      Discussion:

      (For discussion, not a specific comment on this paper): One wonders also about participants' beliefs about the experiment or the intent of the experimenters. I have often had participants tell me they were trying to "figure out" a task or find patterns even when this was not part of the experiment. This is not specific to this paper, but it may be relevant in the future to try and model participant beliefs about the experiment especially in the context of disinformation, when they might be primed to try and "figure things out".

      We thank the reviewer for this important recommendation. We agree and this point is included in our caveat (cited above) that future research should address what assumptions about the generative task structure can allow Bayesian models to account for our empirical patterns.

      As a general comment, in the active inference literature, there has been discussion of state-dependent actions, or "habits", which are learned in order to help agents more rapidly make decisions, based on previous learning. It is also possible that what is being observed is that these habits are at play, and that they represent the cognitive biases. This is likely especially true given, as the authors note, the high cognitive load of the task. It is true that this would mean that full-force Bayesian inference is not being used in each trial, or in each experience an agent might have in the world, but this is likely adaptive on the longer timescale of things, considering resource requirements. I think in this case you could argue that we have a departure from "normative" learning, but that is not necessarily a departure from any possible Bayesian framework, since these biases could potentially be modified by the agent or eschewed in favor of more expensive full-on Bayesian learning when warranted.

      Indeed, in their discussion on the strategy of amplifying credible news sources to drown out low-credibility sources, the authors hint at the possibility of longer-term strategies that may produce optimal outcomes in some contexts, but which were not necessarily appropriate to this task. As such, the performance on this task (and the consideration of true departure from Bayesian processing) should be considered in this wider context.

      Another thing to consider is that Bayesian inference is occurring, but that priors present going in produce the biases, or these biases arise from another source, for example, factoring in epistemic value over rewards when the actual reward is not large. This again would be covered under an active inference approach, depending on how the priors are tuned. Indeed, given the benefit of social cohesion in an evolutionary perspective, some of these "biases" may be the result of adaptation. For example, it might be better to amplify people's good qualities and minimize their bad qualities in order to make it easier to interact with them; this entails a cost (in this case, not adequately learning from feedback and potentially losing out sometimes), but may fulfill a greater imperative (improved cooperation on things that matter). Given the right priors/desired states, this could still be a Bayes-optimal inference at a social level and, as such, may be ingrained as a habit that requires effort to break at the individual level during a task such as this.

      We thank the reviewer for these insightful suggestions speaking further to the point about more complex Bayesian models.

      The authors note that this task does not relate to "emotional engagement" or "deep, identity-related issues". While I agree that this is likely mostly true, it is also possible that just being told one is being lied to might elicit an emotional response that could bias responses, even if this is a weak response.

      We agree with the reviewer that a task involving performance-based bonuses, and particularly one where participants are explicitly told they are being lied to, might elicit a weak emotional response. However, our primary point is that the degree of these responses is expected to be substantially weaker than those typically observed in the broader disinformation literature, which frequently deals with highly salient political, social, or identity-related topics that inherently carry strong emotional and personal ties for participants, leading to much more pronounced affective engagement and potential biases. Our task deliberately avoids such issues, thus minimizing the potential for significant emotion-driven biases. We have toned down the discussion accordingly:

      “This occurs even when the decision at hand entails minimal emotional engagement or pertinence to deep, identity-related, issues.”

      Reviewer #2 (Public review):

      This valuable paper studies the problem of learning from feedback given by sources of varying credibility. The solid combination of experiment and computational modeling helps to pin down properties of learning, although some ambiguity remains in the interpretation of results.

      Summary:

      This paper studies the problem of learning from feedback given by sources of varying credibility. Two bandit-style experiments are conducted in which feedback is provided with uncertainty, but from known sources. Bayesian benchmarks are provided to assess normative facets of learning, and alternative credit assignment models are fit for comparison. Some aspects of normativity appear, in addition to deviations such as asymmetric updating from positive and negative outcomes.

      Strengths:

      The paper tackles an important topic, with a relatively clean cognitive perspective. The construction of the experiment enables the use of computational modeling. This helps to pinpoint quantitatively the properties of learning and formally evaluate their impact and importance. The analyses are generally sensible, and parameter recovery analyses help to provide some confidence in the model estimation and comparison.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses:

      (1) The approach in the paper overlaps somewhat with various papers, such as Diaconescu et al. (2014) and Schulz et al. (forthcoming), which also consider the Bayesian problem of learning and applying source credibility, in terms of theory and experiment. The authors should discuss how these papers are complementary, to better provide an integrative picture for readers.

      Diaconescu, A. O., Mathys, C., Weber, L. A., Daunizeau, J., Kasper, L., Lomakina, E. I., ... & Stephan, K. E. (2014). Inferring the intentions of others by hierarchical Bayesian learning. PLoS computational biology, 10(9), e1003810.

      Schulz, L., Schulz, E., Bhui, R., & Dayan, P. Mechanisms of Mistrust: A Bayesian Account of Misinformation Learning. https://doi.org/10.31234/osf.io/8egxh

      We thank the reviewers for pointing us to this relevant work. We have updated the introduction, mentioning these precedents in the literature and highlighting our specific contributions:

      “To address these questions, we adopt a novel approach within the disinformation literature by exploiting a Reinforcement Learning (RL) experimental framework (36). While RL has guided disinformation research in recent years (37–41), our approach is novel in using one of its most popular tasks: the “bandit task”.”

      We also explain in the discussion how these papers relate to the current study:

      “Unlike previous studies wherein participants had to infer source credibility from experience (30,37,72), we took an explicit-instruction approach, allowing us to precisely assess source-credibility impact on learning, without confounding it with errors in learning about the sources themselves. More broadly, our work connects with prior research on observational learning, which examined how individuals learn from the actions or advice of social partners (72–75). This body of work has demonstrated that individuals integrate learning from their private experiences with learning based on others’ actions or advice—whether by inferring the value others attribute to different options or by mimicking their behavior (57,76). However, our task differs significantly from traditional observational learning. Firstly, our feedback agents interpret outcomes rather than demonstrating or recommending actions (30,37,72).”

      (2) It isn't completely clear what the "cross-fitting" procedure accomplishes. Can this be discussed further?

      We thank the reviewer for requesting further clarification on the cross-fitting procedure. Our study utilizes two distinct model families: Bayesian models and CA models. The credit assignment parameters from the CA models can be treated as “data/behavioural features” corresponding to how choice feedback affects choice propensities. The cross-fitting approach allows us, in effect, to examine whether these propensity features are predicted by our Bayesian models. To the extent they are not, we can conclude that empirical behavior is “biased”.

      Thus, in our cross-fitting procedure we compare the CA model parameters extracted from participant data (empirical features) with those that would be expected if our Bayesian agents performed the task. Specifically, we first fit participant behavior with our Bayesian models, then simulate these models using the best-fitting parameters and fit those simulations with our CA models. This generates the set of CA parameters that would be predicted if participants' behavior were fully described by a Bayesian account. By comparing these predicted Bayesian CA parameters with the actual CA parameters obtained from human participants, the cross-fitting procedure allows us to demonstrate quantitatively that the observed participant parameters deviate significantly from normative Bayesian processing. This provides a robust validation that the biases we identify are not artifacts of the CA model's structure but true departures from normative learning.

      We also note that Reviewer 3 suggested an intuitive way to think about the CA parameters: as analogous to logistic regression coefficients in a “sophisticated regression” of choice on (recency-weighted) choice-feedback. We find this suggestion potentially helpful for readers. Under this interpretation, the purpose of the cross-fitting method can be seen simply as estimating the regression coefficients that would be predicted by our Bayesian agents, and comparing those to the empirical coefficients.

      In our manuscript we now explain this issue more clearly by describing how our model is analogous to a logistic regression:

      “The probability to choose a bandit (say A over B) in this family of models is a logistic function of the contrast in choice propensities between these two bandits. One interpretation of this model is as a “sophisticated” logistic regression, where the CA parameters take the role of “regression coefficients” corresponding to the change in log odds of repeating the just-taken action in future trials based on the feedback (+/- CA for positive or negative feedback, respectively; the model also includes gradual perseveration which allows for constant log-odds changes that are not affected by choice feedback). The forgetting rate captures the extent to which the effect of each trial on future choices diminishes with time. The Q-values are thus exponentially decaying sums of logistic choice propensities based on the types of feedback a bandit received.”
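      The logistic-regression reading of the CA model can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, feedback is assumed to be coded +1/-1, the perseveration term is omitted, and the update rule is the Truth-CA rule quoted later in this response with the truth bonus set to zero.

```python
import math

def choice_probability(q_a, q_b):
    """P(choose A over B): a logistic function of the Q-value contrast."""
    return 1.0 / (1.0 + math.exp(-(q_a - q_b)))

def update_q(q, feedback, ca, forgetting_rate):
    """Decay the old value, then add credit for the latest feedback.

    feedback is assumed to be +1 (positive) or -1 (negative);
    ca is the credit-assignment parameter of the agent giving the feedback.
    """
    return (1.0 - forgetting_rate) * q + ca * feedback

def log_odds(p):
    """Log odds of a probability, used to read CA as a regression coefficient."""
    return math.log(p / (1.0 - p))
```

      With no forgetting, a single positive feedback shifts the log odds of repeating the choice by exactly CA, which is what gives the parameter its regression-coefficient interpretation.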

      We also explain our cross-fitting procedure in more detail:

      “To further characterise deviations between behaviour and our Bayesian learning models, we used a “cross-fitting” method. Treating CA parameters as data-features of interest (i.e., feedback dependent changes in choice propensity), our goal was to examine if and how empirical features differ from features extracted from simulations of our Bayesian learning models. Towards that goal, we simulated synthetic data based on Bayesian agents (using participants’ best fitting parameters), but fitted these data using the CA-models, obtaining what we term “Bayesian-CA parameters” (Fig. 2d; Methods). A comparison of these Bayesian-CA parameters, with empirical-CA parameters obtained by fitting CA models to empirical data, allowed us to uncover patterns consistent with, or deviating from, ideal-Bayesian value-based inference. Under the sophisticated logistic-regression interpretation of the CA-model family the cross-fitting method comprises a comparison between empirical regression coefficients (i.e., empirical CA parameters) and regression coefficients based on simulations of Bayesian models (Bayesian CA parameters).”
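      The cross-fitting steps described above can be sketched as a short pipeline. This is only an outline under stated assumptions: `fit_bayes`, `simulate_bayes` and `fit_ca` are hypothetical callables standing in for the actual model-fitting and simulation routines, which are not given in this response.

```python
def cross_fit(datasets, fit_bayes, simulate_bayes, fit_ca):
    """Return (empirical CA parameters, Bayesian-CA parameters), one entry per participant.

    datasets       : list of per-participant data objects
    fit_bayes      : data -> best-fitting Bayesian parameters (hypothetical)
    simulate_bayes : (params, data) -> synthetic data on the same task (hypothetical)
    fit_ca         : data -> fitted CA parameters (hypothetical)
    """
    empirical_ca, bayesian_ca = [], []
    for data in datasets:
        # Empirical features: CA model fitted to the real behaviour.
        empirical_ca.append(fit_ca(data))
        # Predicted features: CA model fitted to behaviour simulated
        # from the participant's best-fitting Bayesian agent.
        bayes_params = fit_bayes(data)
        synthetic = simulate_bayes(bayes_params, data)
        bayesian_ca.append(fit_ca(synthetic))
    return empirical_ca, bayesian_ca
```

      Differences between the two returned lists are then the quantity of interest, for example in a paired test across participants.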

      (3) The Credibility-CA model seems to fit the same as the free-credibility Bayesian model in the first experiment and barely better in the second experiment. Why not use a more standard model comparison metric like the Bayesian Information Criterion (BIC)? Even if there are advantages to the bootstrap method (which should be described if so), the BIC would help for comparability between papers.

      We thank the reviewer for this important comment regarding our model comparison approach. We acknowledge that classical information criteria like AIC and BIC are widely used in RL studies. However, we argue our method for model-comparison is superior.

      We conducted a model recovery analysis demonstrating a significant limitation of using AIC or BIC for model comparison in our data. Both these methods are strongly biased in favor of the Bayesian models. Our PBCM method, on the other hand, is both unbiased and more accurate. We believe this is because “off-the-shelf” methods like AIC and BIC rely on strong assumptions (such as asymptotic sample size and trial-independence) that are not necessarily met in our tasks (our data are finite, and trials in RL tasks depend on previous trials). PBCM avoids such assumptions to obtain comparison criteria specifically tailored to the structure and size of our empirical data. We have now mentioned this fact in the results section of the main text:

      “We considered using AIC and BIC, which apply “off-the-shelf” penalties for model-complexity. However, these methods do not adapt to features like finite sample size (relying instead on asymptotic assumptions) or temporal dependence (as is common in reinforcement learning experiments). In contrast, the parametric bootstrap cross-fitting method replaces these fixed penalties with empirical, data-driven criteria for model selection. Indeed, model-recovery simulations confirmed that whereas AIC and BIC were heavily biased in favour of the Bayesian models, the bootstrap method provided excellent model-recovery (see Fig. S20).”
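      For readers unfamiliar with parametric-bootstrap model comparison, the general idea can be sketched as follows. This illustrates the generic technique, not the authors' exact procedure: `simulate`, `fit_a` and `fit_b` are hypothetical callables (the fits are assumed to return maximized log-likelihoods), and choosing the threshold that maximizes classification accuracy is one common option among several.

```python
def bootstrap_differences(params, simulate, fit_a, fit_b, n_boot, rng):
    """Log-likelihood differences (A minus B) on data simulated from one generating model."""
    diffs = []
    for _ in range(n_boot):
        synthetic = simulate(params, rng)
        diffs.append(fit_a(synthetic) - fit_b(synthetic))
    return diffs

def best_threshold(diffs_under_a, diffs_under_b):
    """Data-driven criterion: classify as model A when the difference exceeds the threshold.

    The threshold is chosen among the observed differences so that the number
    of correctly classified simulated datasets is as large as possible.
    """
    def n_correct(t):
        hits_a = sum(d > t for d in diffs_under_a)
        hits_b = sum(d <= t for d in diffs_under_b)
        return hits_a + hits_b
    return max(sorted(diffs_under_a + diffs_under_b), key=n_correct)
```

      The empirical log-likelihood difference is then compared with this threshold rather than with a fixed AIC or BIC penalty.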

      We have also included such model recovery in the SI document:

      (4) As suggested in the discussion, the updating based on random feedback could be due to the interleaving of trials. If one is used to learning from the source on most trials, the occasional random trial may be hard to resist updating from. The exact interleaving structure should also be clarified (I assume different sources were shown for each bandit pair). This would also relate to work on RL and working memory: Collins, A. G., & Frank, M. J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35(7), 1024-1035.

      We thank the reviewer for this point. The specific interleaved structure of the agents is described in the main text:

      “Each agent provided feedback for 5 trials for each bandit pair (with the agent order interleaved within the bandit pair).”

      As well as in the methods section:

      “Feedback agents were randomly interleaved across trials subject to the constraint that each agent appeared on 5-trials for each bandit pair.”
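      One way to generate a schedule satisfying this constraint is sketched below. It assumes that bandit pairs and agents are shuffled jointly within a block, which the quoted text does not specify; the function and argument names are illustrative only.

```python
import random

def interleaved_schedule(bandit_pairs, agents, trials_per_agent=5, seed=None):
    """Random trial order in which each agent appears exactly
    `trials_per_agent` times for every bandit pair."""
    rng = random.Random(seed)
    trials = [
        (pair, agent)
        for pair in bandit_pairs
        for agent in agents
        for _ in range(trials_per_agent)
    ]
    rng.shuffle(trials)  # interleave pairs and agents at random
    return trials
```

      With three bandit pairs and three agents this yields 45 trials per block.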

      We also thank the reviewer for mentioning the relevant work on working memory. We have now added it to our discussion point:

      “In our main study, we show that participants revised their beliefs based on entirely non-credible feedback, whereas an ideal Bayesian strategy dictates such feedback should be ignored. This finding resonates with the “continued-influence effect” whereby misleading information continues to influence an individual's beliefs even after it has been retracted (59,60). One possible explanation is that some participants failed to infer that feedback from the 1-star agent was statistically void of information content, essentially random (e.g., the group-level credibility of this agent was estimated by our free-credibility Bayesian model as higher than 50%). Participants were instructed that this feedback would be “a lie” 50% of the time but were not explicitly told that this meant it was random and should therefore be disregarded. Notably, however, there was no corresponding evidence random feedback affected behaviour in our discovery study. It is possible that an individual’s ability to filter out random information might have been limited due to a high cognitive load induced by our main study task, which required participants to track the values of three bandit pairs and juggle between three interleaved feedback agents (whereas in our discovery study each experimental block featured a single bandit pair). Future studies should explore more systematically how the ability to filter random feedback depends on cognitive load (61).”

      (5) Why does the choice-repetition regression include "only trials for which the last same-pair trial featured the 3-star agent and in which the context trial featured a different bandit pair"? This could be stated more plainly.

      We thank the reviewer for this question. When we previously submitted our manuscript, we thought that finding enhanced credit-assignment for fully credible feedback following potential disinformation from a different context would constitute a striking demonstration of our “contrast effect”. However, upon reexamining this finding we discovered a coding error (affecting how trials were filtered). We have now corrected and rerun this analysis. We have assessed the contrast effect for both "same-context" trials (where the contextual trial featured the same bandit pair as the learning trial) and "different-context" trials (where the contextual trial featured a different bandit pair). Our re-analysis reveals a selective significant contrast effect in the same-context condition, but no significant effect in the different-context condition. We have updated the main text to reflect these corrected findings and provide a clearer explanation of the analysis:

      “A comparison of empirical and Bayesian credit-assignment parameters revealed a further deviation from ideal Bayesian learning: participants showed an exaggerated credit-assignment for the 3-star agent compared with Bayesian models [Wilcoxon signed-rank test, instructed-credibility Bayesian model (median difference=0.74, z=11.14); free-credibility Bayesian model (median difference=0.62, z=10.71), all p’s<0.001] (Fig. 3a). One explanation for enhanced learning for the 3-star agents is a contrast effect, whereby credible information looms larger against a backdrop of non-credible information. To test this hypothesis, we examined whether the impact of feedback from the 3-star agent is modulated by the credibility of the agent in the trial immediately preceding it. More specifically, we reasoned that the impact of a 3-star agent would be amplified by a “low credibility context” (i.e., when it is preceded by a low credibility trial). In a binomial mixed effects model, we regressed choice-repetition on feedback valence from the last trial featuring the same bandit pair (i.e., the learning trial) and the feedback agent on the trial immediately preceding that last trial (i.e., the contextual credibility; see Methods for model-specification). This analysis included only learning trials featuring the 3-star agent, and context trials featuring the same bandit pair as the learning trial (Fig. 4a). We found that feedback valence interacted with contextual credibility (F(2,2086)=11.47, p<0.001) such that the feedback-effect (from the 3-star agent) decreased as a function of the preceding context-credibility (3-star context vs. 2-star context: b=-0.29, F(1,2086)=4.06, p=0.044; 2-star context vs. 1-star context: b=-0.41, t(2086)=-2.94, p=0.003; and 3-star context vs. 1-star context: b=-0.69, t(2086)=-4.74, p<0.001) (Fig. 4b). This contrast effect was not predicted by simulations of our main models of interest (Fig. 4c). 
No effect was found when focussing on contextual trials featuring a bandit pair different than the one in the learning trial (see SI 3.5). Thus, these results support an interpretation that credible feedback exerts a greater impact on participants’ learning when it follows non-credible feedback, in the same learning context.”

      We have modified the discussion accordingly as well:

      “A striking finding in our study was that for a fully credible feedback agent, credit assignment was exaggerated (i.e., higher than predicted by our Bayesian models). Furthermore, the effect of fully credible feedback on choice was further boosted when it was preceded by a low-credibility context related to current learning. We interpret this in terms of a “contrast effect”, whereby veridical information looms larger against a backdrop of disinformation (21). One upshot is that exaggerated learning might entail a risk of jumping to premature conclusions based on limited credible evidence (e.g., a strong conclusion that a vaccine produces significant side-effect risks based on weak credible information, following non-credible information about the same vaccine). An intriguing possibility, that could be tested in future studies, is that participants strategically amplify the extent of learning from credible feedback to dilute the impact of learning from non-credible feedback. For example, a person scrolling through a social media feed, encountering copious amounts of disinformation, might amplify the weight they assign to credible feedback in order to dilute effects of ‘fake news’. Ironically, these results also suggest that public campaigns might be more effective when embedding their messages in low-credibility contexts, which may boost their impact.”

      And we have included some additional analyses in the SI document:

      “3.5 Contrast effects for contexts featuring a different bandit

      Given that we observed a contrast effect when both the learning and the immediately preceding “context trial” involved the same pair of bandits, we next investigated whether this effect persisted when the context trial featured a different bandit pair – a situation where the context would be irrelevant to the current learning. Again, we used a binomial mixed-effects model, regressing choice-repetition on feedback valence in the learning trial and the feedback agent in the context trial. This analysis included only learning trials featuring the 3-star agent, and context trials featuring a different bandit pair than the learning trial (Fig. S22a). We found no significant evidence of an interaction between feedback valence and contextual credibility (F(2,2364)=0.21, p=0.81) (Fig. S22b). This null result was consistent with the range of outcomes predicted by our main computational models (Fig. S22c).

      We aimed to formally compare the influence of two types of contextual trials: those featuring the same bandit pair as the learning trial versus those featuring a different pair. To achieve this, we extended our mixed-effects model by incorporating a new predictor variable, "CONTEXT_TYPE", which coded whether the contextual trial involved the same bandit pair (coded as -0.5) or a different bandit pair (+0.5) compared to the learning trial. The Wilkinson notation for this expanded mixed-effects model is:

      REPEAT ~ CONTEXT_TYPE * FEEDBACK * (CONTEXT_2-star + CONTEXT_3-star) + BETTER + (1|participant)

      This expanded model revealed a significant three-way interaction between feedback valence, contextual credibility, and context type (F(2,4451) = 7.71, p<0.001). Interpreting this interaction, we found a 2-way interaction between context-source and feedback valence when the context was the same (F(2,4451) = 12.03, p<0.001), but not when context was different (F(2,4451) = 0.23, p = 0.79). Further interpreting the double feedback-valence * context-source interaction (for the same context) we obtained the same conclusions as reported in the main text.”
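      The predictor coding in this expanded model can be made concrete with a small sketch. Only the -0.5/+0.5 coding of CONTEXT_TYPE is stated in the text; the 0/1 coding of FEEDBACK, the use of the 1-star context as the reference level, and the function name are assumptions made for illustration, and the BETTER covariate is left out because its definition is not given here.

```python
def code_predictors(same_pair, feedback_positive, context_stars):
    """Fixed-effect predictors for one learning trial (illustrative coding).

    same_pair         : True if the context trial used the same bandit pair
    feedback_positive : True if the 3-star agent's feedback was positive
    context_stars     : credibility (1, 2 or 3 stars) of the context-trial agent
    """
    return {
        "CONTEXT_TYPE": -0.5 if same_pair else 0.5,         # coding stated in the text
        "FEEDBACK": 1 if feedback_positive else 0,          # assumed coding
        "CONTEXT_2star": 1 if context_stars == 2 else 0,    # 1-star assumed as reference
        "CONTEXT_3star": 1 if context_stars == 3 else 0,
    }
```

      Under this centred coding of CONTEXT_TYPE, the lower-order FEEDBACK-by-context terms describe the effect averaged over same- and different-pair contexts, while the three-way terms capture how that effect differs between the two.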

      (6) Why apply the "Truth-CA" model and not the Bayesian variant that it was motivated by?

      Thanks for this very useful suggestion. We are unsure if we fully understand the question. The Truth-CA model was not motivated by a new Bayesian model. Our Bayesian models were simply used to make the point that participants may partially discriminate between truthful and untruthful feedback (for a given source). This led to the idea that perhaps more credit is assigned for truth (than lie) trials, which is what we found using our Truth-CA model. Note we show that our Bayesian models cannot account for this modulation.

      We have now improved our "Truth-CA" model. Previously, our Truth-CA model considered whether feedback on each trial was true or not based on realized latent true outcomes. However, it is possible that the very same feedback would have had an opposite truth-status if the latent true outcome was different (recall true outcomes are stochastic). This injects noise into the trial classification in our previous model. To avoid this, in our new model feedback is modulated by the probability the reported feedback is true (marginalized over stochasticity of true outcome).

      We have described this new model in the methods section:

      “Additionally, we formulated a “Truth-CA” model, which worked as our Credibility-CA model, but incorporated a free truth-bonus parameter (TB). This parameter modulates the extent of credit assignment for each agent based on the posterior probability of feedback being true (given the credibility of the feedback agent, and the true reward probability of the chosen bandit). The chosen bandit was updated as follows:

      $Q \leftarrow (1 - f_Q) \cdot Q + [CA(\mathrm{agent}) + TB \cdot (P(\mathrm{truth}) - 0.5)] \cdot F$

      where P(truth) is the posterior probability of the feedback being true in the current trial (for exact calculation of P(truth) see “Methods: Bayesian estimation of posterior belief that feedback is true”).”
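      A minimal sketch of this update is given below. The exact calculation of P(truth) is deferred to the Methods and is not reproduced in this response, so the Bayes-rule expression here is an assumption: it supposes that the agent reports the true outcome with probability equal to its credibility and otherwise reports the opposite outcome. Feedback F is assumed to be coded +1/-1, and all function names are hypothetical.

```python
def p_truth(feedback, credibility, reward_prob):
    """Posterior probability that the reported feedback is true (assumed form).

    feedback    : +1 (reward reported) or -1 (no reward reported)
    credibility : probability that the agent reports the true outcome
    reward_prob : true reward probability of the chosen bandit
    Undefined (division by zero) for feedback that could never be reported.
    """
    if feedback > 0:
        truthful = credibility * reward_prob
        lying = (1.0 - credibility) * (1.0 - reward_prob)
    else:
        truthful = credibility * (1.0 - reward_prob)
        lying = (1.0 - credibility) * reward_prob
    return truthful / (truthful + lying)

def truth_ca_update(q, feedback, ca_agent, tb, p_true, forgetting_rate):
    """Q <- (1 - f_Q) * Q + [CA(agent) + TB * (P(truth) - 0.5)] * F"""
    return (1.0 - forgetting_rate) * q + (ca_agent + tb * (p_true - 0.5)) * feedback
```

      Under this assumed form, feedback from a 50%-credibility agent still has a P(truth) that varies with the bandit's reward probability, so a truth bonus can modulate credit assignment even for the random agent.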

      All relevant results have been updated accordingly in the main text:

      “To formally address whether feedback truthfulness modulates credit assignment, we fitted a new variant of the CA model (the “Truth-CA” model) to the data. This variant works as our Credibility-CA model but incorporates a truth-bonus parameter (TB) which increases the degree of credit assignment for feedback as a function of the experimenter-determined likelihood the feedback is true (which is read from the curves in Fig 6a when x is taken to be the true probability the bandit is rewarding). Specifically, after receiving feedback, the Q-value of the chosen option is updated according to the following rule: $Q \leftarrow (1 - f_Q) \cdot Q + [CA(\mathrm{agent}) + TB \cdot (P(\mathrm{truth}) - 0.5)] \cdot F$, where $TB$ is the free parameter representing the truth bonus, and $P(\mathrm{truth})$ is the probability of the received feedback being true (from the experimenter’s perspective). We acknowledge that this model falls short of providing a mechanistically plausible description of the credit assignment process, because participants have no access to the experimenter’s truthfulness likelihoods (as the true bandit reward probabilities are unknown to them). Nonetheless, we use this ‘oracle model’ as a measurement tool to glean rough estimates for the extent to which credit assignment is boosted as a function of its truthfulness likelihood. Fitting this Truth-CA model to participants' behaviour revealed a significant positive truth-bonus (mean=0.21, t(203)=3.12, p=0.002), suggesting that participants indeed assign greater weight to feedback that is likely to be true (Fig. 6c; see SI 3.3.1 for detailed ML parameter results). Notably, simulations using our other models (Methods) consistently predicted smaller truth biases (compared to the empirical bias) (Fig. 6d). Moreover, truth bias was still detected even in a more flexible model that allowed for both a positivity bias and truth-bias (see SI 3.7). 
The upshot is that participants are biased to assign higher credit based on feedback that is more likely to be true in a manner that is inconsistent with our Bayesian models and above and beyond the previously identified positivity biases.”

      Finally, the Supplementary Information for the discovery study has also been revised to feature this analysis:

      “We next assessed whether participants infer whether the feedback they received on each trial was true or false and adjust their credit assignment based on this inference. We again used the “Truth-CA” model to obtain estimates for the truth bonus (TB), the increase in credit assignment as a function of the posterior probability of feedback being true. As in our main study, the fitted truth bias parameter was significantly positive, indicating that participants assign greater weight to feedback they believe is likely to be true (Fig. S4a; see SI 3.3.1 for detailed ML parameter results). Strikingly, model-simulations (Methods) predicted a lower truth bonus than the one observed in participants (Fig. S4b).”

      (7) "Overall, the results from this study support the exact same conclusions (See SI section 1.2) but with one difference. In the discovery study, we found no evidence for learning based on 50%-credibility feedback when examining either the feedback effect on choice repetition or CA in the credibility-CA model (SI 1.2.3)" - this seems like a very salient difference, when the paper reports the feedback effect as a primary finding of interest, though I understand there remains a valence-based difference.

      We agree with the reviewer and thank them for this suggestion. We now state explicitly throughout the manuscript that this finding was obtained only in one of our two studies. In the “Discovery study” section of the results, we state explicitly that this effect was not observed in the discovery study:

      “However, we found no evidence for learning based on 50%-credibility feedback when examining either the feedback effect on choice repetition or CA in the credibility-CA model (SI 1.2.3).”

      We also note that related to another concern from R3 (that perseveration may masquerade as positivity bias) we conducted additional analyses (detailed in SI 3.6.2). These analyses revealed that the observed positivity bias for the 1-star agent in the discovery study falls within the range predicted by simple choice-perseveration. Consequently, we have removed the suggestion that participants still learn from the random agent in the discovery study. Furthermore, we have modified the discussion section to include a possible explanation for this discrepancy between the two studies:

      “Notably, however, there was no corresponding evidence random feedback affected behaviour in our discovery study. It is possible that an individual’s ability to filter out random information might have been limited due to a high cognitive load induced by our main study task, which required participants to track the values of three bandit pairs and juggle between three interleaved feedback agents (whereas in our discovery study each experimental block featured a single bandit pair). Future studies should explore more systematically how the ability to filter random feedback depends on cognitive load (61).”

      (8) "Participants were instructed that this feedback would be “a lie” 50% of the time but were not explicitly told that this meant it was random and should therefore be disregarded." - I agree that this is a possible explanation for updating from the random source. It is a meaningful caveat.

      Thank you for this thought. While this can be seen as a caveat—since we don’t know what would have happened with explicit instructions—we also believe it is interesting from another perspective. In many real-life situations, individuals may have all the necessary information to infer that the feedback they receive is uninformative, yet still fail to do so, especially when they are not explicitly told to ignore it.

      In future work, we plan to examine how behaviour changes when participants are given more explicit instructions—for example, that the 50%-credibility agent provides purely random feedback.

      (9) "Future studies should investigate conditions that enhance an ability to discard disinformation, such as providing explicit instructions to ignore misleading feedback, manipulations that increase the time available for evaluating information, or interventions that strengthen source memory." - there is work on some of this in the misinformation literature that should be cited, such as the "continued influence effect". For example: Johnson, H. M., & Seifert, C. M. (1994). Sources of the continued influence effect: When misinformation in memory affects later inferences. Journal of experimental psychology: Learning, memory, and cognition, 20(6), 1420.

      We thank the reviewer for pointing us towards the relevant literature. We have now included citations about the “continued influence effect” of misinformation in the discussion:

      “In our main study, we show that participants revised their beliefs based on entirely non-credible feedback, whereas an ideal Bayesian strategy dictates such feedback should be ignored. This finding resonates with the “continued-influence effect” whereby misleading information continues to influence an individual's beliefs even after it has been retracted (59,60).”

      (10) Are the authors arguing that choice-confirmation bias may be at play? Work on choice-confirmation bias generally includes counterfactual feedback, which is not present here.

      We agree with the reviewer that a definitive test for choice-confirmation bias typically requires counterfactual feedback, which is not present in our current task. In our discussion, we indeed suggest that the positivity bias we observe may stem from a form of choice-confirmation, drawing on the extensive literature on this bias in reinforcement learning (Lefebvre et al., 2017; Palminteri et al., 2017; Palminteri & Lebreton, 2022). However, we fully acknowledge that this link is a hypothesis and that explicitly testing for choice-confirmation bias would necessitate a future study specifically incorporating counterfactual feedback. We have included a clarification of this point in the discussion:

      “Previous reinforcement learning studies report greater credit-assignment based on positive compared to negative feedback, albeit only in the context of veridical feedback (43,44,62). Here, supporting our a-priori hypothesis, we show that this positivity bias is amplified for information of low and intermediate credibility (in absolute terms in the discovery study, and relative to the overall extent of CA in both studies). Of note, previous literature has interpreted enhanced learning for positive outcomes in reinforcement learning as indicative of a confirmation bias (42,44). For example, positive feedback may confirm, to a greater extent than negative feedback, one’s choice as superior (e.g., “I chose the better of the two options”). Leveraging the framework of motivated cognition (35), we posited that feedback of uncertain veracity (e.g., low credibility) amplifies this bias by incentivising individuals to self-servingly accept positive feedback as true (because it confers positive, desirable outcomes), and explain away undesirable, choice-disconfirming, negative feedback as false. This could imply an amplified confirmation bias on social media, where content from sources of uncertain credibility, such as unknown or unverified users, is more easily interpreted in a self-serving manner, disproportionately reinforcing existing beliefs (63). In turn, this could contribute to an exacerbation of the negative social outcomes previously linked to confirmation bias such as polarization (64,65), the formation of ‘echo chambers’ (19), and the persistence of misbelief regarding contemporary issues of importance such as vaccination (66,67) and climate change (68–71). We note however, that further studies are required to determine whether positivity bias in our task is indeed a form of confirmation bias.”

      Reviewer #3 (Public review):

      Summary

      This paper investigates how disinformation affects reward learning processes in the context of a two-armed bandit task, where feedback is provided by agents with varying reliability (with lying probability explicitly instructed). They find that people learn more from credible sources, but also deviate systematically from optimal Bayesian learning: They learned from uninformative random feedback, learned more from positive feedback, and updated too quickly from fully credible feedback (especially following low-credibility feedback). Overall, this study highlights how misinformation could distort basic reward learning processes, without appeal to higher-order social constructs like identity.

      Strengths

      (1) The experimental design is simple and well-controlled; in particular, it isolates basic learning processes by abstracting away from social context.

      (2) Modeling and statistics meet or exceed the standards of rigor.

      (3) Limitations are acknowledged where appropriate, especially those regarding external validity.

      (4) The comparison model, Bayes with biased credibility estimates, is strong; deviations are much more compelling than e.g., a purely optimal model.

      (5) The conclusions are interesting, in particular the finding that positivity bias is stronger when learning from less reliable feedback (although I am somewhat uncertain about the validity of this conclusion)

      We deeply thank the reviewer for highlighting the strengths of this work.

      Weaknesses

      (1) Absolute or relative positivity bias?

      In my view, the biggest weakness in the paper is that the conclusion of greater positivity bias for lower credible feedback (Figure 5) hinges on the specific way in which positivity bias is defined. Specifically, we only see the effect when normalizing the difference in sensitivity to positive vs. negative feedback by the sum. I appreciate that the authors present both and add the caveat whenever they mention the conclusion (with the crucial exception of the abstract). However, what we really need here is an argument that the relative definition is the right way to define asymmetry....

      Unfortunately, my intuition is that the absolute difference is a better measure. I understand that the relative version is common in the RL literature; however previous studies have used standard TD models, whereas the current model updates based on the raw reward. The role of the CA parameter is thus importantly different from a traditional learning rate - in particular, it's more like a logistic regression coefficient (as described below) because it scales the feedback but not the decay. Under this interpretation, a difference in positivity bias across credibility conditions corresponds to a three-way interaction between the exponentially weighted sum of previous feedback of a given type (e.g., positive from the 75% credible agent), feedback positivity, and condition (dummy coded). This interaction corresponds to the non-normalized, absolute difference.

      Importantly, I'm not terribly confident in this argument, but it does suggest that we need a compelling argument for the relative definition.

      We thank the reviewer for raising this important point about the definition of positivity bias, and for their thoughtful discussion on the absolute versus relative measures. We believe that the relative valence bias offers a distinct and valuable perspective on positivity bias. Conceptually, this measure describes positivity bias in a manner akin to a “percentage difference” relative to the overall level of learning, which allows us to control for the decrease in the overall amount of credit assignment as feedback becomes less credible. We are unsure whether one measure is better or more correct than the other, and we believe that reporting both measures enriches the understanding of positivity bias and allows for a more comprehensive characterization of this phenomenon (as long as these measures are interpreted carefully). We have stated the significance of the relative measure in the results section:

      “Following previous research, we quantified positivity bias in 2 ways: 1) as the absolute difference between credit-assignment based on positive or negative feedback, and 2) as the same difference but relative to the overall extent of learning. We note that the second, relative, definition is more akin to “percentage change” measurements, providing a control for the overall lower levels of credit-assignment for less credible agents.”
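      The two definitions can be made concrete with a minimal sketch. It assumes, following the reviewer's description, that the relative index divides the difference between the positive- and negative-feedback CA parameters by their sum; the function name and the parameter values are hypothetical and are not taken from the manuscript.

```python
def valence_bias_indices(ca_pos, ca_neg):
    """Return (absolute, relative) valence bias for one feedback agent.

    absolute = CA+ - CA-
    relative = (CA+ - CA-) / (CA+ + CA-)   # assumed normalization by the sum
    """
    absolute = ca_pos - ca_neg
    total = ca_pos + ca_neg
    # The relative index is undefined when the denominator is zero.
    relative = absolute / total if total != 0 else float("nan")
    return absolute, relative


# Hypothetical CA parameters for a high- and a low-credibility agent.
high_cred = valence_bias_indices(ca_pos=2.0, ca_neg=1.5)
low_cred = valence_bias_indices(ca_pos=0.8, ca_neg=0.4)

print("high credibility (absolute, relative):", high_cred)
print("low credibility  (absolute, relative):", low_cred)
```

      With these illustrative values the absolute bias is smaller for the low-credibility agent while the relative bias is larger, which is the kind of dissociation between the two measures that the reviewer's comment is about. Note that the relative index becomes hard to interpret when CA- is negative, because the denominator no longer reflects the overall extent of learning.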

      We also wish to point out that in our discovery study we had some evidence for amplification of positivity bias in an absolute sense.

      (2) Positivity bias or perseveration?

      A key challenge in interpreting many of the results is dissociating perseveration from other learning biases. In particular, a positivity bias (Figure 5) and perseveration will both predict a stronger correlation between positive feedback and future choice. Crucially, the authors do include a perseveration term, so one would hope that perseveration effects have been controlled for and that the CA parameters reflect true positivity biases. However, with finite data, we cannot be sure that the variance will be correctly allocated to each parameter (cf. collinearity in regressions). The fact that CA- is fit to be negative for many participants (a pattern shown more strongly in the discovery study) is suggestive that this might be happening. A priori, the idea that you would ever increase your value estimate after negative feedback is highly implausible, which suggests that the parameter might be capturing variance other than what it is intended to capture.

      The best way to resolve this uncertainty would involve running a new study in which feedback was sometimes provided in the absence of a choice - this would isolate positivity bias. Short of that, perhaps one could fit a version of the Bayesian model that also includes perseveration. If the authors can show that this model cannot capture the pattern in Figure 5, that would be fairly convincing.

      We thank the reviewer for this very insightful and crucial point regarding the potential confound between positivity bias and perseveration. We entirely agree that distinguishing these effects can be challenging. To rigorously address this concern and ascertain that our observed positivity bias, particularly its inflation for low-credibility feedback, is not merely an artifact of perseveration, we conducted additional analyses as suggested.

      First, following the reviewer’s suggestion we simulated our Bayesian models, including a perseveration term, for both our main and discovery studies. Crucially, none of these simulations predicted the specific pattern of inflated positivity bias for low-credibility feedback that we identified in participants.

      Additionally, taking a “devil’s advocate” approach, we tested whether our Credibility-CA model (which includes perseveration but not a feedback valence bias) can predict our positivity bias findings. Thus, we simulated 100 datasets using our Credibility-CA model (based on empirical best-fitting parameters). We then fitted each of these simulated datasets using our Credibility-Valence CA model. By examining the distribution of results across these synthetic dataset fits and comparing them to the actual results from participants, we found that while perseveration could indeed lead (as the reviewer suspected) to an artifactual positivity bias, it could not predict the magnitude of the observed inflation of positivity bias for low-credibility feedback (whether measured in absolute or relative terms).

      Based on these comprehensive analyses, we are confident that our main results concerning the modulation of a valence bias as a function of source-credibility cannot be accounted for by simple choice-perseveration. We have briefly explained these analyses in the main results section:

      “Previous research has suggested that positivity bias may spuriously arise from pure choice-perseveration (i.e., a tendency to repeat previous choices regardless of outcome) (49,50). While our models included a perseveration-component, this control may not be perfect. Therefore, in additional control analyses, we generated synthetic datasets using models including choice-perseveration but devoid of feedback-valence bias, and fitted them with our credibility-valence model (see SI 3.6.1). These analyses confirmed that perseveration can masquerade as an apparent positivity bias. Critically, however, these analyses also confirmed that perseveration cannot account for our main finding of increased positivity bias, relative to the overall extent of CA, for low-credibility feedback.”

      Additionally, we have added a detailed description of these additional analyses and their findings to the Supplementary Information document:

      “3.6 Positivity bias results cannot be explained by a pure perseveration

      3.6.1 Main study

      Previous research has suggested it may be challenging to dissociate between a feedback-valence positivity bias and perseveration (i.e., a tendency to repeat previous choices regardless of outcome). While our Credit Assignment (CA) models already include a perseveration mechanism to account for this, this control may not be perfect. We thus conducted several tests to examine if our positivity-bias related results could be accounted for by perseveration.

      First, we examined whether our Bayesian models, augmented by a perseveration mechanism (as in our CA model), can generate predictions similar to our empirical results. We repeated our cross-fitting procedure for these extended Bayesian models. To briefly recap, this involved fitting participant behavior with them, generating synthetic datasets based on the resulting maximum likelihood (ML) parameters, and then fitting these simulated datasets with our Credibility-Valence CA model (which is designed to detect positivity bias). This test revealed that adding perseveration to our Bayesian models did not predict a positivity bias in learning. In absolute terms there was a small negativity bias (instructed-credibility Bayesian: b=−0.19, F(1,1218)=17.78, p<0.001, Fig. S23a-b; free-credibility Bayesian: b=−0.17, F(1,1218)=13.74, p<0.001, Fig. S23d-e). In relative terms we detected no valence related bias (instructed-credibility Bayesian: b=−0.034, F(1,609)=0.45, p=0.50, Fig. S22c; free-credibility Bayesian: b=−0.04, F(1,609)=0.51, p=0.47, Fig. S23f). More critically, these simulations also did not predict a change in the level of positivity bias as a function of feedback credibility, neither at an absolute level (instructed-credibility Bayesian: F(2,1218)=0.024, p=0.98, Fig. S23b; free-credibility Bayesian: F(2,1218)=0.008, p=0.99, Fig. S23e), nor at a relative level (instructed-credibility Bayesian: F(2,609)=1.57, p=0.21, Fig. S23c; free-credibility Bayesian: F(2,609)=0.13, p=0.88, Fig. S23f). The upshot is that our positivity-bias findings cannot be accounted for by our Bayesian models even when these are augmented with perseveration.

      However, it is still possible that empirical CA parameters from our credibility-valence model (reported in main text Fig. 5) were distorted, absorbing variance from perseveration. To address this, we took a “devil's advocate” approach testing the assumption that CA parameters are not truly affected by feedback valence and that there is only perseveration in our data. Towards that goal, we simulated data using our Credibility-CA model (which includes perseveration but does not contain a valence bias in its learning mechanism) and then fitted these synthetic datasets using our Credibility-Valence CA model to see if the observed positivity bias could be explained by perseveration alone. Specifically, we generated 101 “group-level” synthetic datasets (each including one simulation for each participant, based on their empirical ML parameters), and fitted each dataset with our Credibility-Valence CA model. We then analysed the resulting ML parameters in each dataset using the same mixed-effects models as described in the main text, examining the distribution of effects of interest across these simulated datasets. Comparing these simulation results to the data from participants revealed a nuanced picture. While the positivity bias observed in participants is within the range predicted by a pure perseveration account when measured in absolute terms (Fig. S24a), it is much higher than predicted by pure perseveration when measured relative to the overall level of learning (Fig. S24c). More importantly, the inflation in positivity bias for lower credibility feedback is substantially higher in participants than what would be predicted by a pure perseveration account, a finding that holds true for both absolute (Fig. S24b) and relative (Fig. S24d) measures.”
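      The final comparison step described above (an empirical effect set against the distribution of the same effect across perseveration-only simulations) can be sketched as follows. The text does not state how the comparison was quantified, so the summary statistic used here, the fraction of simulated effects at or above the empirical one, is an assumption, as are the function name and all numbers, which are purely illustrative.

```python
def fraction_at_or_above(simulated_effects, empirical_effect):
    """Fraction of simulated (null) effects that reach the empirical effect.

    A small fraction suggests the empirical effect is larger than what the
    simulated, perseveration-only account tends to produce.
    """
    if not simulated_effects:
        raise ValueError("need at least one simulated effect")
    count = sum(1 for effect in simulated_effects if effect >= empirical_effect)
    return count / len(simulated_effects)


# Hypothetical effect sizes: one value per simulated group-level dataset.
simulated = [0.02, 0.05, -0.01, 0.04, 0.03, 0.00, 0.06, 0.01]
empirical = 0.12  # hypothetical effect estimated from participants

print(fraction_at_or_above(simulated, empirical))
```

      In a real analysis each entry of `simulated` would come from refitting one synthetic dataset and rerunning the same mixed-effects model used for the participants' data.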

      “3.6.2 Discovery study

      We then replicated these analyses in our discovery study to confirm our findings. We again checked whether extended versions of the Bayesian models (including perseveration) predicted the positivity bias results observed. Our cross-fitting procedure showed that the instructed-credibility Bayesian model with perseveration did predict a positivity bias for all credibility levels in this discovery study, both when measured in absolute terms [50% credibility (b=1.74,t(824)=6.15), 70% credibility (b=2.00,F(1,824)=49.98), 85% credibility (b=1.81,F(1,824)=40.78), 100% credibility (b=2.42,F(1,824)=72.50), all p's<0.001], and in relative terms [50% credibility (b=0.25,t(412)=3.44), 70% credibility (b=0.31,F(1,412)=17.72), 85% credibility (b=0.34,F(1,412)=21.06), 100% credibility (b=0.42,F(1,412)=31.24), all p's<0.001]. However, importantly, these simulations did not predict a change in the level of positivity bias as a function of feedback credibility, neither at an absolute level (F(3,412)=1.43,p=0.24), nor at a relative level (F(3,412)=2.06,p=0.13) (Fig. S25a-c). In contrast, simulations of the free-credibility Bayesian model (with perseveration) predicted a slight negativity bias when measured in absolute terms (b=−0.35,F(1,824)=5.14,p=0.024), and no valence bias when measured relative to the overall degree of learning (b=0.05,F(1,412)=0.55,p=0.46). Crucially, this model also did not predict a change in the level of positivity bias as a function of feedback credibility, neither at an absolute level (F(3,824)=0.27,p=0.77), nor at a relative level (F(3,412)=0.76,p=0.47) (Fig. S25d-f).

      As in our main study, we next assessed whether our Credibility-CA model (which includes perseveration but no valence bias) predicted the positivity bias results observed in participants in the discovery study. This analysis revealed that the average positivity bias in participants is higher than predicted by a pure perseveration account, both when measured in absolute terms (Fig. S26a) and in relative terms (Fig. S26c). Specifically, only the aVBI for the 70% credibility agent was above what a perseveration account would predict, while the rVBI for all agents except the completely credible one exceeded that threshold. Furthermore, the inflation in positivity bias for lower credibility feedback (compared to the 100% credibility agent) is significantly higher in participants than would be predicted by a pure perseveration account, in both absolute (Fig. S26b) and relative (Fig. S26d) terms.

      Together, these results show that the general positivity bias observed in participants could be predicted by an instructed-credibility Bayesian model with perseveration, or by a CA model with perseveration. Moreover, we find that these two models can predict a positivity bias for the 50% credibility agent, raising a concern that our positivity bias findings for this source may be an artefact of not-fully controlled for perseveration. However, the credibility modulation of this positivity bias, where the bias is amplified for lower credibility feedback, is consistently not predicted by perseveration alone, regardless of whether perseveration is incorporated into a Bayesian or a CA model. This finding suggests that participants are genuinely modulating their learning based on feedback credibility, and that this modulation is not merely an artifact of choice perseveration.”

      (3) Veracity detection or positivity bias?

      The "True feedback elicits greater learning" effect (Figure 6) may be simply a re-description of the positivity bias shown in Figure 5. This figure shows that people have higher CA for trials where the feedback was in fact accurate. But assuming that people tend to choose more rewarding options, true-feedback cases will tend to also be positive-feedback cases. Accordingly, a positivity bias would yield this effect, even if people are not at all sensitive to trial-level feedback veracity. Of course, the reverse logic also applies, such that the "positivity bias" could actually reflect discounting of feedback that is less likely to be true. This idea has been proposed before as an explanation for confirmation bias (see Pilgrim et al, 2024 https://doi.org/10.1016/j.cognition.2023.105693 and much previous work cited therein). The authors should discuss the ambiguity between the "positivity bias" and "true feedback" effects within the context of this literature....

      Before addressing these excellent comments, we first note that we have now improved our "Truth-CA" model. Previously, our Truth-CA model considered whether feedback on each trial was true or not based on realized latent true outcomes. However, it is possible that the very same feedback would have had an opposite truth-status if the latent true outcome was different (recall true outcomes are stochastic). This injects noise into the trial classification in our former model. To avoid this, in our new model feedback is modulated by the probability the reported feedback is true (marginalized over stochasticity of true outcome). Please note that, as described in our responses below, we conducted extensive analyses to confirm that positivity bias does not in fact predict the truth bias we detect using our truth-bias model.

      We have described this new model in the methods section:

      “Additionally, we formulated a “Truth-CA” model, which worked as our Credibility-CA model, but incorporated a free truth-bonus parameter (TB). This parameter modulates the extent of credit assignment for each agent based on the posterior probability of feedback being true (given the credibility of the feedback agent, and the true reward probability of the chosen bandit). The chosen bandit was updated as follows:

      $Q \leftarrow (1 - f_Q) \cdot Q + [CA(\mathrm{agent}) + TB \cdot (P(\mathrm{truth}) - 0.5)] \cdot F$

      where P(truth) is the posterior probability of the feedback being true in the current trial (for exact calculation of P(truth) see “Methods: Bayesian estimation of posterior belief that feedback is true”).”
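      A minimal sketch of this update rule is given below. It assumes that the feedback F is coded as +1 for positive and -1 for negative feedback, which is not stated in this excerpt; the function name, argument names, and numerical values are hypothetical.

```python
def truth_ca_update(q, feedback, ca_agent, tb, p_truth, f_q):
    """One Truth-CA update of the chosen bandit's propensity.

    Implements Q <- (1 - f_Q) * Q + [CA(agent) + TB * (P(truth) - 0.5)] * F,
    with `feedback` assumed to be +1 (positive) or -1 (negative).
    """
    effective_ca = ca_agent + tb * (p_truth - 0.5)
    return (1.0 - f_q) * q + effective_ca * feedback


# Hypothetical parameters: same agent and feedback, different P(truth).
likely_true = truth_ca_update(q=0.0, feedback=1, ca_agent=1.0, tb=0.2,
                              p_truth=0.9, f_q=0.1)
uninformative = truth_ca_update(q=0.0, feedback=1, ca_agent=1.0, tb=0.2,
                                p_truth=0.5, f_q=0.1)

print(likely_true, uninformative)
```

      Because the bonus is centred on P(truth) = 0.5, a positive TB increases credit assignment for feedback that is more likely true than false and decreases it otherwise; at P(truth) = 0.5 the rule reduces to the plain Credibility-CA update.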

      All relevant results have been updated accordingly in the main text:

      “To formally address whether feedback truthfulness modulates credit assignment, we fitted a new variant of the CA model (the “Truth-CA” model) to the data. This variant works like our Credibility-CA model but incorporates a truth-bonus parameter (TB), which increases the degree of credit assignment for feedback as a function of the experimenter-determined likelihood the feedback is true (which is read from the curves in Fig 6a when x is taken to be the true probability the bandit is rewarding). Specifically, after receiving feedback, the Q-value of the chosen option is updated according to the following rule:

      $Q \leftarrow (1 - f_Q) \cdot Q + [CA(\mathrm{agent}) + TB \cdot (P(\mathrm{truth}) - 0.5)] \cdot F$

      where 𝑇𝐵 is the free parameter representing the truth bonus, and 𝑃(𝑡𝑟𝑢𝑡ℎ) is the probability of the received feedback being true (from the experimenter’s perspective). We acknowledge that this model falls short of providing a mechanistically plausible description of the credit assignment process, because participants have no access to the experimenter’s truthfulness likelihoods (as the true bandit reward probabilities are unknown to them). Nonetheless, we use this ‘oracle model’ as a measurement tool to glean rough estimates for the extent to which credit assignment is boosted as a function of its truthfulness likelihood.
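      For intuition about P(truth), the sketch below applies Bayes' rule under a simple assumed generative model: the agent reports the true binary outcome with probability equal to its credibility and reports the opposite otherwise. The manuscript's exact calculation is given in its Methods and may differ; the function and argument names are hypothetical.

```python
def p_feedback_true(credibility, p_reward, feedback_positive):
    """Posterior probability that a report is truthful (assumed model).

    credibility: probability that the agent reports the true outcome.
    p_reward: true reward probability of the chosen bandit.
    feedback_positive: True if the agent reported a reward.
    """
    # Probability that the true outcome matches the reported one.
    p_match = p_reward if feedback_positive else 1.0 - p_reward
    truthful = credibility * p_match              # true outcome, honest report
    lying = (1.0 - credibility) * (1.0 - p_match)  # opposite outcome, lie
    total = truthful + lying
    # total == 0 means this report could not occur under the assumed model.
    return truthful / total if total > 0 else float("nan")


# Hypothetical example: 75% credible agent, bandit rewarding 75% of the time.
print(p_feedback_true(0.75, 0.75, feedback_positive=True))
print(p_feedback_true(0.75, 0.75, feedback_positive=False))
```

      Under these assumptions, positive feedback about a mostly rewarding bandit is more likely to be true than negative feedback from the same agent, which is the source of the potential overlap between a truth effect and a positivity bias raised by the reviewer. For a fully credible agent the posterior is 1 regardless of valence.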

      Fitting this Truth-CA model to participants' behaviour revealed a significant positive truth-bonus (mean=0.21, t(203)=3.12, p=0.002), suggesting that participants indeed assign greater weight to feedback that is likely to be true (Fig. 6c; see SI 3.3.1 for detailed ML parameter results). Notably, simulations using our other models (Methods) consistently predicted smaller truth biases (compared to the empirical bias) (Fig. 6d). Moreover, truth bias was still detected even in a more flexible model that allowed for both a positivity bias and truth-bias (see SI 3.7). The upshot is that participants are biased to assign higher credit based on feedback that is more likely to be true in a manner that is inconsistent with our Bayesian models and above and beyond the previously identified positivity biases.”

      Finally, the Supplementary Information for the discovery study has also been revised to feature this analysis:

      “We next assessed whether participants infer whether the feedback they received on each trial was true or false and adjust their credit assignment based on this inference. We again used the “Truth-CA” model to obtain estimates for the truth bonus (TB), the increase in credit assignment as a function of the posterior probability of feedback being true. As in our main study, the fitted truth bias parameter was significantly positive, indicating that participants assign greater weight to feedback they believe is likely to be true (Fig. S4a; see SI 3.3.1 for detailed ML parameter results). Strikingly, model-simulations (Methods) predicted a lower truth bonus than the one observed in participants (Fig. S4b).”

      Additionally, we thank the reviewer for pointing us to the relevant work by Pilgrim et al. (2024). We agree that the relationship between "true feedback" and "positivity bias" effects is nuanced, and their potential overlap warrants careful consideration. Note, however, that our analyses suggest that the truth bonus is not solely a re-description of positivity bias. Firstly, simulations of our Credibility-Valence CA model predict only a small "truth bonus" effect, which is notably smaller than what we observed in participants. Secondly, we formulated an extension of our "Truth-CA" model that includes a valence bias in credit assignment. If our truth bonus results were merely an artifact of positivity bias, this extended model should absorb that variance, producing a null truth bonus parameter. However, fitting this model to participant data still revealed a significant positive truth bonus, which again exceeds the range predicted by simulations of our Credibility-CA model:

      “3.7 Truth inference is still detected when controlling for valence bias

      Given that participants frequently select bandits that are, on average, mostly rewarding, it is reasonable to assume that positive feedback is more likely to be objectively true than negative feedback. This raises the question of whether the "truth inference" effect we observed in participants might simply be an alternative description of a positivity bias in learning. To directly test this idea, we extended our Truth-CA model to explicitly account for a valence bias in credit assignment. This extended model features separate CA parameters for positive and negative feedback for each agent. When we fitted this new model to participant behavior, it still revealed a significant truth bonus in both the main study (Wilcoxon’s signed-rank test: median = 0.09, z(202)=2.12, p=0.034; Fig. S27a) and the discovery study (median = 3.52, z(102)=7.86, p<0.001; Fig. S27c). Moreover, in the main study, this truth bonus remained significantly higher than what was predicted by all the alternative models, with the exception of the instructed-credibility Bayesian model (Fig. S27b). In the discovery study, the truth bonus was significantly higher than what was predicted by all the alternative models (Fig. S27d).”

      Together, these findings suggest that our truth inference results are not simply a re-description of a positivity bias.

      Conversely, we acknowledge the reviewer's point that our positivity bias results could potentially stem from a more general truth inference mechanism. We believe that this possibility should be addressed in a future study where participants rate their belief that received feedback is true (rather than a lie). We have extended our discussion to clarify this possibility and to include the suggested citation:

      “Our findings show that individuals increase their credit assignment for feedback in proportion to the perceived probability that the feedback is true, even after controlling for source credibility and feedback valence. Strikingly, this learning bias was not predicted by any of our Bayesian or credit-assignment (CA) models. Notably, our evidence for this bias is based on an “oracle model” that incorporates the probability of feedback truthfulness from the experimenter's perspective, rather than the participant’s. This raises an important open question: how do individuals form beliefs about feedback truthfulness, and how do these beliefs influence credit assignment? Future research should address this by eliciting trial-by-trial beliefs about feedback truthfulness. Doing so would also allow for testing the intriguing possibility that an exaggerated positivity bias for non-credible sources reflects, to some extent, a truth-based discounting of negative feedback (i.e., participants may judge such feedback as less likely to be true). However, it is important to note that the positivity bias observed for fully credible sources (here and in other literature) cannot be attributed to a truth bias, unless participants were, against instructions, distrustful of that source.”

      The authors get close to this in the discussion, but they characterize their results as differing from the predictions of rational models, the opposite of my intuition. They write:

      “Alternative "informational" (motivation-independent) accounts of positivity and confirmation bias predict a contrasting trend (i.e., reduced bias in low- and medium credibility conditions) because in these contexts it is more ambiguous whether feedback confirms one's choice or outcome expectations, as compared to a full-credibility condition.”

      I don't follow the reasoning here at all. It seems to me that the possibility for bias will increase with ambiguity (or perhaps will be maximal at intermediate levels). In the extreme case, when feedback is fully reliable, it is impossible to rationally discount it (illustrated in Figure 6A). The authors should clarify their argument or revise their conclusion here.

      We apologize for the lack of clarity in our previous explanation. We removed the sentence you cited (it was intended to make a different point which we now consider non-essential). Our current narration is consistent with the point you are making.

      (4) Disinformation or less information?

      Zooming out, from a computational/functional perspective, the reliability of feedback is very similar to reward stochasticity (the difference is that reward stochasticity decreases the importance/value of learning in addition to its difficulty). I imagine that many of the effects reported here would be reproduced in that setting. To my surprise, I couldn't quickly find a study asking that precise question, but if the authors know of such work, it would be very useful to draw comparisons. To put a finer point on it, this study does not isolate which (if any) of these effects are specific to disinformation, rather than simply less information. I don't think the authors need to rigorously address this in the current study, but it would be a helpful discussion point.

      We thank the reviewer for highlighting the parallel (and difference) between feedback reliability and reward stochasticity. However, we have not found any comparable results in the literature. We also note that our discussion includes a paragraph addressing the locus of our effects, making the point that more studies are necessary to determine whether our findings are due to disinformation per se or to sources being less informative. Because this paragraph was already included in the previous version, we inferred that our Discussion was too long, and we have therefore shortened it considerably:

      “An important question arises as to the psychological locus of the biases we uncovered. Because we were interested in how individuals process disinformation (deliberately false or misleading information intended to deceive or manipulate), we framed the feedback agents in our study as deceptive, who would occasionally “lie” about the true choice outcome. However, statistically (though not necessarily psychologically), these agents are equivalent to agents who mix truth-telling with random “guessing” or “noise” where inaccuracies may arise from factors such as occasionally lacking access to true outcomes, simple laziness, or mistakes, rather than an intent to deceive. This raises the question of whether the biases we observed are driven by the perception of potential disinformation as deceitful per se or simply as deviating from the truth. Future studies could address this question by directly comparing learning from statistically equivalent sources framed as either lying or noisy. Unlike previous studies wherein participants had to infer source credibility from experience (30,37,72), we took an explicit-instruction approach, allowing us to precisely assess source-credibility impact on learning, without confounding it with errors in learning about the sources themselves. More broadly, our work connects with prior research on observational learning, which examined how individuals learn from the actions or advice of social partners (72–75). This body of work has demonstrated that individuals integrate learning from their private experiences with learning based on others’ actions or advice, whether by inferring the value others attribute to different options or by mimicking their behavior (57,76). However, our task differs significantly from traditional observational learning. Firstly, our feedback agents interpret outcomes rather than demonstrating or recommending actions (30,37,72). Secondly, participants in our study lack private experiences unmediated by feedback sources. Finally, unlike most observational learning paradigms, we systematically address scenarios with deliberately misleading social partners. Future studies could bridge this by incorporating deceptive social partners into observational learning, offering a chance to develop unified models of how individuals integrate social information when credibility is paramount for decision-making.”

      (5) Over-reliance on analyzing model parameters

      Most of the results rely on interpreting model parameters, specifically, the "credit assignment" (CA) parameter. Exacerbating this, many key conclusions rest on a comparison of the CA parameters fit to human data vs. those fit to simulations from a Bayesian model. I've never seen anything like this, and the authors don't justify or even motivate this analysis choice. As a general rule, analyses of model parameters are less convincing than behavioral results because they inevitably depend on arbitrary modeling assumptions that cannot be fully supported. I imagine that most or even all of the results presented here would have behavioral analogues. The paper would benefit greatly from the inclusion of such results. It would also be helpful to provide a description of the model in the main text that makes it very clear what exactly the CA parameter is capturing (see next point).

      We thank the reviewer for this important suggestion which we address together with the following point.

      (6) RL or regression?

      I was initially very confused by the "RL" model because it doesn't update based on the TD error. Consequently, the "Q values" can go beyond the range of possible reward (SI Figure 5). These values are therefore not Q values, which are defined as expectations of future reward ("action values"). Instead, they reflect choice propensities, which are sometimes notated $h$ in the RL literature. This misuse of notation is unfortunately quite common in psychology, so I won't ask the authors to change the variable. However, they should clarify when introducing the model that the Q values are not action values in the technical sense. If there is precedent for this update rule, it should be cited.

      Although the change is subtle, it suggests a very different interpretation of the model.

      Specifically, I think the "RL model" is better understood as a sophisticated logistic regression, rather than a model of value learning. Ignoring the decay term, the CA term is simply the change in log odds of repeating the just-taken action in future trials (the change is negated for negative feedback). The PERS term is the same, but ignoring feedback. The decay captures that the effect of each trial on future choices diminishes with time. Importantly, however, we can re-parameterize the model such that the choice at each trial is a logistic regression where the independent variables are an exponentially decaying sum of feedback of each type (e.g., positive-cred50, positive-cred75, ... negative-cred100). The CA parameters are simply coefficients in this logistic regression.
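      The re-parameterization described in this paragraph can be checked numerically with a small sketch. It assumes that the decay is applied to the propensity on every trial, that feedback enters with sign +1 or -1, and it leaves out the PERS term for brevity; all names and parameter values are hypothetical and are not taken from the manuscript.

```python
def recursive_propensity(trials, ca, decay):
    """Propensity of one option after the trial-by-trial update rule."""
    h = 0.0
    for chosen, valence, agent in trials:
        h *= 1.0 - decay  # assumed: the propensity decays on every trial
        if chosen:
            sign = 1.0 if valence == "pos" else -1.0
            h += ca[(valence, agent)] * sign
    return h


def decayed_regressors(trials, decay):
    """Exponentially decaying signed sum of feedback, one per feedback type."""
    n = len(trials)
    x = {}
    for t, (chosen, valence, agent) in enumerate(trials):
        if chosen:
            sign = 1.0 if valence == "pos" else -1.0
            weight = (1.0 - decay) ** (n - 1 - t)  # older trials count less
            x[(valence, agent)] = x.get((valence, agent), 0.0) + sign * weight
    return x


def regression_propensity(trials, ca, decay):
    """Same propensity written as coefficients times decayed regressors."""
    x = decayed_regressors(trials, decay)
    return sum(ca[key] * value for key, value in x.items())


# Hypothetical CA coefficients, keyed by (valence, agent credibility in %).
ca = {("pos", 50): 0.4, ("neg", 50): 0.2, ("pos", 75): 0.9,
      ("neg", 75): 0.6, ("pos", 100): 1.5, ("neg", 100): 1.2}
# Each trial: (was this option chosen?, feedback valence, agent credibility).
trials = [(True, "pos", 75), (False, None, None),
          (True, "neg", 50), (True, "pos", 100)]

h_recursive = recursive_propensity(trials, ca, decay=0.2)
h_regression = regression_propensity(trials, ca, decay=0.2)
print(h_recursive, h_regression)
```

      The two functions return the same value up to floating-point error, which illustrates why the CA parameters can be read as coefficients on decayed feedback histories; passing the propensity difference between options through a logistic function would then give the choice probability.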

      Critically, this is not meant to "deflate" the model. Instead, it clarifies that the CA parameter is actually not such an assumption-laden model estimate. It is really quite similar to a regression coefficient, something that is usually considered "model agnostic". It also recasts the non-standard "cross-fitting" approach as a very standard comparison of regression coefficients for model simulations vs. human data. Finally, using different CA parameters for true vs false feedback is no longer a strange and implausible model assumption; it's just another (perfectly valid) regression. This may be a personal thing, but after adopting this view, I found all the results much easier to understand.

      We thank the reviewer for their insightful and illuminating comments, particularly concerning the interpretation of our model parameters and the nature of our Credit Assignment model. We believe your interpretation of the model is accurate, and we now narrate it to readers in the hope that our modelling will become clearer and more intuitive. We also present to readers how this recasts our “cross-fitting” approach in the way you suggested (we return to this point below).

      Broadly, while we agree that modelling results depend on underlying assumptions, we believe that “model-agnostic” approaches also have important limitations—especially in reinforcement learning (RL), where choices are shaped by histories of past events, which such approaches often fail to fully account for. As students of RL, we are frequently struck by how careful modelling demonstrates that seemingly meaningful “model-agnostic” patterns can emerge as artefacts of unaccounted-for variables. We also note that the term “model-agnostic” is difficult to define—after all, even regression models rely on assumptions, and some computational models make richer or more transparent assumptions than others. Ideally, we aim to support our findings using converging methods wherever possible.

      We want to clarify that many of our reported findings indeed stem from straightforward behavioral analyses (e.g., simple regressions of choice-repetition), which do not rely on complex modeling assumptions. The two key results that primarily depend on the analysis of model parameters are our findings related to positivity bias and truth inference.

      Regarding the positivity bias, identifying truly model-agnostic behavioral signatures, distinct from effects like choice-perseveration, has historically been a significant challenge in the literature. Classical research on this bias rests on the interpretation of model parameters (Lefebvre et al., 2017; Palminteri et al., 2017), or at least on the use of models to assess what an “unbiased learner” baseline should look like (Palminteri & Lebreton, 2022). Some researchers have suggested possible regressions incorporating history effects to detect positivity bias from choice-repetition behavior, but these regressions (like our model) rely on subtle assumptions about forgetting and history effects (Toyama et al., 2019). In our case, this issue is also demonstrated by the analysis we conducted in relation to the reviewer's previous point (about perseveration masquerading as positivity bias). We believe that clearly dissociating positivity bias from perseveration is an important challenge for the field going forward.

      For our truth inference results, obtaining purely behavioral signatures is similarly challenging due to the intricate interdependencies (which the reviewer identified in previous points) between agent credibility, feedback valence, feedback truthfulness, and choice accuracy within our task design.

      Finally, we agree with the reviewer that regression coefficients are often interpreted as a “model-agnostic” pattern. From this perspective, even our findings regarding positivity and truth bias are not a case of over-reliance on complex model assumptions but rather a way to expose deviations between empirical “sophisticated” regression coefficients and coefficients predicted from Bayesian models.

      We have now described the main learning rule of our model in the main text to ensure that the meaning of the CA parameters is clearer for readers:

      “Next, we formulated a family of non-Bayesian computational RL models. Importantly, these models can flexibly express non-Bayesian learning patterns and, as we show in the following sections, can serve to identify learning biases deviating from an idealized Bayesian strategy. Here, an assumption is that during feedback, the choice propensity for the chosen bandit (which here is represented by a point estimate, “Q value”, rather than a distribution) either increases or decreases (for positive or negative feedback, respectively) according to a magnitude quantified by the free “Credit-Assignment (CA)” model parameters (47):

      Q(chosen) ← (1 − f_Q) * Q(chosen) + CA(agent, valence) * F

      where F is the feedback received from the agents (coded as 1 for reward feedback and -1 for non-reward feedback), while f_Q (∈ [0,1]) is the free parameter representing the forgetting rate of the Q-value (Fig. 2a, bottom panel; Fig. S5b; Methods). The probability to choose a bandit (say A over B) in this family of models is a logistic function of the contrast in choice propensities between these two bandits. One interpretation of this model is as a “sophisticated” logistic regression, where the CA parameters take the role of “regression coefficients” corresponding to the change in log odds of repeating the just-taken action in future trials based on the feedback (+/- CA for positive or negative feedback, respectively; the model also includes gradual perseveration which allows for constant log-odds changes that are not affected by choice feedback; see “Methods: RL models”). The forgetting rate captures the extent to which the effect of each trial on future choices diminishes with time. The Q-values are thus exponentially decaying sums of logistic choice propensities based on the types of feedback a bandit received.”
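      As a reading aid, the update rule and its "decaying sum" interpretation can be sketched in a few lines of Python. This is a minimal illustration, not the fitting code used in the study: the feedback sequence, the CA values and the forgetting rate are hypothetical, the perseveration term is left out, and decay is applied only when the bandit is updated.

```python
import math

def ca_update(q, feedback, ca, f_q):
    """One application of Q <- (1 - f_Q) * Q + CA * F for the chosen bandit."""
    return (1.0 - f_q) * q + ca * feedback

def choice_prob(q_a, q_b):
    """Probability of choosing bandit A: logistic function of the Q contrast."""
    return 1.0 / (1.0 + math.exp(-(q_a - q_b)))

def q_as_decaying_sum(feedbacks, cas, f_q):
    """Closed form: each past trial contributes CA * F, discounted by
    (1 - f_Q) once for every later update of the same bandit."""
    n = len(feedbacks)
    return sum(ca * f * (1.0 - f_q) ** (n - 1 - t)
               for t, (f, ca) in enumerate(zip(feedbacks, cas)))

# Hypothetical feedback history for one bandit (+1 reward, -1 non-reward)
feedbacks = [1, 1, -1, 1]
# Hypothetical CA values, one per trial (e.g. depending on the feedback agent)
cas = [0.8, 0.3, 0.3, 1.2]
f_q = 0.2  # hypothetical forgetting rate

q = 0.0
for f, ca in zip(feedbacks, cas):
    q = ca_update(q, f, ca, f_q)

closed_form = q_as_decaying_sum(feedbacks, cas, f_q)
print(q, closed_form)        # the recursive and closed-form values agree
print(choice_prob(q, 0.0))   # choice probability against an untouched bandit
```

      Because the Q-values enter the choice rule as log odds, each CA value acts like a regression weight on one past feedback event, with the forgetting rate setting how quickly that weight fades.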

      We also explain the implications of this perspective for our cross-fitting procedure:

      “To further characterise deviations between behaviour and our Bayesian learning models, we used a “cross-fitting” method. Treating CA parameters as data-features of interest (i.e., feedback dependent changes in choice propensity), our goal was to examine if and how empirical features differ from features extracted from simulations of our Bayesian learning models. Towards that goal, we simulated synthetic data based on Bayesian agents (using participants’ best fitting parameters), but fitted these data using the CA-models, obtaining what we term “Bayesian-CA parameters” (Fig. 2d; Methods). A comparison of these Bayesian-CA parameters, with empirical-CA parameters obtained by fitting CA models to empirical data, allowed us to uncover patterns consistent with, or deviating from, ideal-Bayesian value-based inference. Under the sophisticated logistic-regression interpretation of the CA-model family, the cross-fitting method comprises a comparison between empirical regression coefficients (i.e., empirical-CA parameters) and regression coefficients based on simulations of Bayesian models (Bayesian-CA parameters). Using this approach, we found that both the instructed-credibility and free-credibility Bayesian models predicted increased Bayesian-CA parameters as a function of agent credibility (Fig. 3c; see SI 3.1.1.2 Tables S8 and S9). However, an in-depth comparison between Bayesian and empirical CA parameters revealed discrepancies from ideal Bayesian learning, which we describe in the following sections.”
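      The logic of cross-fitting (simulate from one model, fit with another) can be illustrated with a toy example. Everything below is an assumption made for illustration: the grid-based Bayesian learner is a stand-in and not the instructed-credibility or free-credibility model from the paper, the CA model has a single CA parameter with a fixed forgetting rate, and a coarse grid search replaces a proper optimiser.

```python
import math
import random

GRID = [i / 10 for i in range(11)]  # candidate reward probabilities for a bandit

def bayes_update(belief, feedback, credibility):
    """Grid posterior over a bandit's reward probability after one feedback.
    Toy assumption: the agent reports the true outcome with probability
    `credibility` and the opposite outcome otherwise."""
    post = []
    for theta, p in zip(GRID, belief):
        p_pos = credibility * theta + (1 - credibility) * (1 - theta)
        post.append(p * (p_pos if feedback == 1 else 1 - p_pos))
    z = sum(post)
    return [p / z for p in post]

def simulate_bayesian(n_trials, credibility, rng, beta=5.0, p_reward=(0.75, 0.25)):
    """Simulate (choice, feedback) pairs from the toy Bayesian learner."""
    beliefs = [[1 / len(GRID)] * len(GRID) for _ in range(2)]
    data = []
    for _ in range(n_trials):
        means = [sum(t * p for t, p in zip(GRID, b)) for b in beliefs]
        p_a = 1 / (1 + math.exp(-beta * (means[0] - means[1])))
        choice = 0 if rng.random() < p_a else 1
        outcome = 1 if rng.random() < p_reward[choice] else -1
        feedback = outcome if rng.random() < credibility else -outcome
        beliefs[choice] = bayes_update(beliefs[choice], feedback, credibility)
        data.append((choice, feedback))
    return data

def ca_neg_log_lik(data, ca, f_q=0.1):
    """Negative log-likelihood of the choices under the CA update rule."""
    q = [0.0, 0.0]
    nll = 0.0
    for choice, feedback in data:
        p_a = 1 / (1 + math.exp(-(q[0] - q[1])))
        p_choice = p_a if choice == 0 else 1 - p_a
        nll -= math.log(max(p_choice, 1e-12))
        q[choice] = (1 - f_q) * q[choice] + ca * feedback
    return nll

def fit_ca(data, grid=None):
    """Grid-search maximum-likelihood estimate of the CA parameter."""
    grid = grid or [i / 10 for i in range(-10, 31)]
    return min(grid, key=lambda ca: ca_neg_log_lik(data, ca))

rng = random.Random(0)
for credibility in (0.5, 0.75, 1.0):
    synthetic = simulate_bayesian(400, credibility, rng)
    # A "Bayesian-CA" estimate for this toy agent at this credibility level
    print(credibility, fit_ca(synthetic))
```

      In the study, the analogous estimates from simulated Bayesian agents are then compared with CA parameters fitted to participants' own choices.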

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      (1) Keep terms consistent, e.g., follow-up vs. main; hallmark vs. traditional.

      We have now changed the text to keep terms consistent.

      (2) CA model is like a learning rate; but it's based on the raw reward, not the TD error - this seems strange.

      We thank the reviewer for this comment. We understand that the use of a CA model instead of a TD error model may seem unusual at first glance. However, the CA model offers an important advantage: it more easily accommodates what we term "negative learning rates". This means that some participants may treat certain agents (especially the random one) as consistently deceitful, leading them to effectively increase/reduce choice tendencies following negative/positive feedback. A CA model handles this naturally by allowing negative CA parameters as a simple extension of positive ones. In contrast, adapting a TD error model to account for this is more complex. For instance, attempting to introduce a "negative learning rate" makes the RW model behave in a non-stable manner (e.g., Q values become <0 or >1). At the initial stages of our project, we explored different approaches to dealing with this issue and we found the CA model provides the best approach. For these reasons, we decided to proceed with our CA model.

      Additionally, we used the CA model in previous studies (e.g., Moran, Dayan & Dolan (2021)) where we included (in SI) a detailed discussion of the similarities and differences between credit-assignment and Rescorla-Wagner models.
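      The stability argument can be made concrete with a small numerical sketch. The parameter values are hypothetical, and the Rescorla-Wagner rule is written in its textbook form with rewards coded 0/1, which may differ in detail from the variants the authors explored.

```python
def rescorla_wagner(q, reward, alpha):
    """Q <- Q + alpha * (reward - Q), with reward coded 0/1."""
    return q + alpha * (reward - q)

def credit_assignment(q, feedback, ca, f_q):
    """Q <- (1 - f_Q) * Q + CA * F, with feedback coded +1/-1."""
    return (1.0 - f_q) * q + ca * feedback

def run(update, q0, inputs, **params):
    """Apply `update` over a sequence of inputs and return the Q trajectory."""
    trajectory = [q0]
    for x in inputs:
        trajectory.append(update(trajectory[-1], x, **params))
    return trajectory

n = 10  # ten consecutive rewarded trials
rw_path = run(rescorla_wagner, 0.5, [1] * n, alpha=-0.3)          # negative learning rate
ca_path = run(credit_assignment, 0.0, [1] * n, ca=-0.3, f_q=0.2)  # negative CA

print(rw_path[-1])  # drifts ever further below 0: the prediction error grows by 1.3x per trial
print(ca_path[-1])  # approaches CA / f_Q = -1.5 and stays bounded
```

      With a negative learning rate the Rescorla-Wagner prediction error is multiplied by (1 - alpha) > 1 on every trial, so Q leaves the [0, 1] range. A negative CA simply flips the sign of a bounded quantity, provided the forgetting rate is above zero.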

      (3) Why was the follow-up study not pre-registered?

      We appreciate the reviewer's comment regarding preregistration, which we should have done. Unfortunately, this is now “water under the bridge” but going forward we hope to pre-register increasing parts of our work.

      (4) Other work looking at reward stochasticity?

      As noted in point 4 of the main weaknesses, previous work on reward stochasticity primarily focused on explaining the increase/decrease in learning and its mechanistic bases under varying stochasticity levels. In our study, we uniquely characterize several specific learning biases that are modulated by source credibility, a topic not extensively explored within the existing reward stochasticity framework, as far as we know.

      (5) Equation 1 is different from the one in the figure?

      The reviewer is completely correct. The figure provides a simplified visual representation, primarily focusing on the feedback-based update of the Q-value, and for simplicity, it omits the forgetting term present in the full Equation 1. To ensure complete clarity and prevent any misunderstanding, we have now incorporated a more detailed explanation of the model, including the complete Equation 1 and its components, directly within the main text. This comprehensive description will ensure that readers are fully aware of how the model operates.

      “Next, we formulated a family of non-Bayesian computational RL models. Importantly, these models can flexibly express non-Bayesian learning patterns and, as we show in the following sections, can serve to identify learning biases deviating from an idealized Bayesian strategy. Here, an assumption is that during feedback, the choice propensity for the chosen bandit (which here is represented by a point estimate, “Q value”, rather than a distribution) either increases or decreases (for positive or negative feedback, respectively) according to a magnitude quantified by the free “Credit-Assignment (CA)” model parameters (47):

      Q(chosen) ← (1 − f_Q) * Q(chosen) + CA(agent, valence) * F

      where F is the feedback received from the agents (coded as 1 for reward feedback and -1 for non-reward feedback), while f_Q (∈ [0,1]) is the free parameter representing the forgetting rate of the Q-value (Fig. 2a, bottom panel; Fig. S5b; Methods).”
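      For orientation, the relation between the full rule and a version without the forgetting term can be written side by side. This assumes that the simplified form shown in the figure corresponds to setting the forgetting rate to zero, which is how the response above describes it; the figure itself is not reproduced here.

```latex
\begin{align}
\text{Full rule (Equation 1):}\quad
  & Q(\text{chosen}) \leftarrow (1 - f_Q)\,Q(\text{chosen}) + CA(\text{agent},\text{valence})\cdot F \\
\text{Without forgetting } (f_Q = 0):\quad
  & Q(\text{chosen}) \leftarrow Q(\text{chosen}) + CA(\text{agent},\text{valence})\cdot F
\end{align}
```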

      (6) Please describe/plot the distribution of all fitted parameters in the supplement. I would include the mean and SD in the main text (methods) as well.

      Following the reviewer’s suggestions, we have included in the Supplementary Document tables displaying the mean and SD of fitted parameters from participants for our main models of interest. We have also plotted the distributions of such parameters. Both for the main study:

      (7) "A novel approach within the disinformation literature by exploiting a Reinforcement Learning (RL) experimental framework".

      The idea of applying RL to disinformation is not new. Please tone down novelty claims. It would be nice to cite/discuss some of this work as well.

      https://arxiv.org/abs/2106.05402?utm_source=chatgpt.com
      https://www.scirp.org/pdf/jbbs_2022110415273931.pdf
      https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4173312

      We thank the reviewer for pointing us towards relevant literature. We have now toned down the sentence in the introduction and cited the references provided:

      “To address these questions, we adopt a novel approach within the disinformation literature by exploiting a Reinforcement Learning (RL) experimental framework (36). While RL has guided disinformation research in recent years (37–40), our approach is novel in using one of its most popular tasks: the “bandit task”.”

      (8) Figure 3a - The figures should be in the order that they're referenced (3 is referenced before 2).

      We generally try to stick to this important rule but, in this case, we believe that our ordering better serves the narrative and hope the reviewer will excuse this small violation.

      (9) "Additionally, we found a positive feedback-effect for the 3-star agent"

      What is the analysis here? To avoid confusion with the "positive feedback" effect, consider using "positive effect of feedback". The dash wasn't sufficient to avoid confusion in my case.

      We have now updated the terms in the text to avoid confusion.

      (10) The discovery study revealed even stronger results supporting a conclusion that the credibility-CA model was superior to both Bayesian models for most subjects

      This is very subjective, but I'll just mention that my "cherry-picking" flag was raised by this sentence. Are you only mentioning cases where the discovery study was consistent with the main study? Upon a closer read, I think the answer is most likely "no", but you might consider adopting a more systematic (perhaps even explicit) policy on when and how you reference the discovery study to avoid creating this impression in a more casual reader.

      We thank the reviewer for this valuable suggestion. To prevent any impression of "cherry-picking", we have removed specific references to the discovery study from the main body of the text. Instead, all discussions regarding the convergence and divergence of results between the two studies are now in the dedicated section focusing on the discovery study:

      “The discovery study (n=104) used a disinformation task structurally similar to that used in our main study, but with three notable differences: 1) it included 4 feedback agents, with credibilities of 50%, 70%, 85% and 100%, represented by 1, 2, 3, and 4 stars, respectively; 2) each experimental block consisted of a single bandit pair, presented over 16 trials (with 4 trials for each feedback agent); and 3) in certain blocks, unbeknownst to participants, the two bandits within a pair were equally rewarding (see SI section 1.1). Overall, this study's results supported conclusions similar to those of our main study (see SI section 1.2) with a few differences. We found convergent support for increased learning from more credible sources (SI 1.2.1), superior fit for the CA model over Bayesian models (SI 1.2.2) and increased learning from feedback inferred to be true (SI 1.2.6). Additionally, we found an inflation of positivity bias for low-credibility sources both when measured relative to the overall level of credit assignment (as in our main study) and in absolute terms (unlike in our main study) (Fig. S3; SI 1.2.5). Moreover, choice-perseveration could not predict an amplification of positivity bias for low-credibility sources (see SI 3.6.2). However, we found no evidence for learning based on 50%-credibility feedback when examining either the feedback effect on choice repetition or CA in the credibility-CA model (SI 1.2.3).”

      (11) An in-depth comparison between Bayesian and empirical CA parameters revealed discrepancies from normative Bayesian learning.

      Consider saying where this in-depth comparison can be found (based on my reading, I think you're referring to the next section?).

      We have now modified the sentence for better clarity:

      “However, an in-depth comparison between Bayesian and empirical CA parameters revealed discrepancies from ideal Bayesian learning, which we describe in the following sections.”

      (12) "which essentially provides feedback" Perhaps you meant "random feedback"?

      We have modified the text as suggested by the reviewer.

      (13) Essentially random

      Why "essentially"? Isn't it just literally random?

      We have modified the text as suggested by the reviewer.

      (14) Both Bayesian models predicted an attenuated credit-assignment for the 3-star agent

      Attenuated relative to what? I wouldn't use this word if you mean weaker than what we see in the human data. Instead, I would say people show an exaggerated credit-assignment, since Bayes is the normative baseline.

      We changed the text according to the reviewer’s suggestion:

      “A comparison of empirical and Bayesian credit-assignment parameters revealed a further deviation from ideal Bayesian learning: participants showed an exaggerated credit-assignment for the 3-star agent compared with Bayesian models.”

      (15) "there was no difference between 2-star and 3-star agent contexts (b=0.051, F(1,2419)=0.39, p=0.53)"

      You cannot confirm the null hypothesis! Instead, you can write "The difference between 2-star and 3-star agent contexts was not significant". Although even with this language, you should be careful that your conclusions don't rest on the lack of a difference (the next sentence is somewhat ambiguous on this point).

      Additionally, the reported b coefs do not match the figure, which if anything, suggests a larger drop from 0.75 (2-star) to 1 (3-star). Is this a mixed vs fixed effects thing? It would be helpful to provide an explanation here.

      We thank the reviewer for this question. When we previously submitted our manuscript, we thought that finding enhanced credit-assignment for fully credible feedback following potential disinformation from a DIFFERENT context would constitute a striking demonstration of our “contrast effect”. However, upon reexamining this finding we found out we had a coding error (affecting how trials were filtered). We have now rerun and corrected this analysis. We have assessed the contrast effect for both "same-context" trials (where the contextual trial featured the same bandit pair as the learning trial) and "different-context" trials (where the contextual trial featured a different bandit pair). Our re-analysis reveals a selective significant contrast effect in the same-context condition, but no significant effect in the different-context condition. We have updated the main text to reflect these corrected findings and provide a clearer explanation of the analysis:

      “A comparison of empirical and Bayesian credit-assignment parameters revealed a further deviation from ideal Bayesian learning: participants showed an exaggerated credit-assignment for the 3-star agent compared with Bayesian models [Wilcoxon signed-rank test, instructed-credibility Bayesian model (median difference=0.74, z=11.14); free-credibility Bayesian model (median difference=0.62, z=10.71), all p’s<0.001] (Fig. 3a). One explanation for enhanced learning for the 3-star agent is a contrast effect, whereby credible information looms larger against a backdrop of non-credible information. To test this hypothesis, we examined whether the impact of feedback from the 3-star agent is modulated by the credibility of the agent in the trial immediately preceding it. More specifically, we reasoned that the impact of a 3-star agent would be amplified by a “low credibility context” (i.e., when it is preceded by a low credibility trial). In a binomial mixed effects model, we regressed choice-repetition on feedback valence from the last trial featuring the same bandit pair (i.e., the learning trial) and the feedback agent on the trial immediately preceding that last trial (i.e., the contextual credibility; see Methods for model-specification). This analysis included only learning trials featuring the 3-star agent, and context trials featuring the same bandit pair as the learning trial (Fig. 4a). We found that feedback valence interacted with contextual credibility (F(2,2086)=11.47, p<0.001) such that the feedback-effect (from the 3-star agent) decreased as a function of the preceding context-credibility (3-star context vs. 2-star context: b=-0.29, F(1,2086)=4.06, p=0.044; 2-star context vs. 1-star context: b=-0.41, t(2086)=-2.94, p=0.003; and 3-star context vs. 1-star context: b=-0.69, t(2086)=-4.74, p<0.001) (Fig. 4b). This contrast effect was not predicted by simulations of our main models of interest (Fig. 4c). No effect was found when focussing on contextual trials featuring a bandit pair different from the one in the learning trial (see SI 3.5). Thus, these results support an interpretation that credible feedback exerts a greater impact on participants’ learning when it follows non-credible feedback, in the same learning context.”

      We have modified the discussion accordingly as well:

      “A striking finding in our study was that for a fully credible feedback agent, credit assignment was exaggerated (i.e., higher than predicted by our Bayesian models). Furthermore, the effect of fully credible feedback on choice was further boosted when it was preceded by a low-credibility context related to current learning. We interpret this in terms of a “contrast effect”, whereby veridical information looms larger against a backdrop of disinformation (21). One upshot is that exaggerated learning might entail a risk of jumping to premature conclusions based on limited credible evidence (e.g., a strong conclusion that a vaccine produces significant side-effect risks based on weak credible information, following non-credible information about the same vaccine). An intriguing possibility, that could be tested in future studies, is that participants strategically amplify the extent of learning from credible feedback to dilute the impact of learning from non-credible feedback. For example, a person scrolling through a social media feed, encountering copious amounts of disinformation, might amplify the weight they assign to credible feedback in order to dilute effects of ‘fake news’. Ironically, these results also suggest that public campaigns might be more effective when embedding their messages in low-credibility contexts, which may boost their impact.”

      And we have included some additional analyses in the SI document:

      “3.5 Contrast effects for contexts featuring a different bandit

      Given that we observed a contrast effect when both the learning and the immediately preceding “context trial” involved the same pair of bandits, we next investigated whether this effect persisted when the context trial featured a different bandit pair, a situation where the context would be irrelevant to the current learning. Again, we used a binomial mixed-effects model, regressing choice-repetition on feedback valence in the learning trial and the feedback agent in the context trial. This analysis included only learning trials featuring the 3-star agent, and context trials featuring a different bandit pair than the learning trial (Fig. S22a). We found no significant evidence of an interaction between feedback valence and contextual credibility (F(2,2364)=0.21, p=0.81) (Fig. S22b). This null result was consistent with the range of outcomes predicted by our main computational models (Fig. S22c).

      We aimed to formally compare the influence of two types of contextual trials: those featuring the same bandit pair as the learning trial versus those featuring a different pair. To achieve this, we extended our mixed-effects model by incorporating a new predictor variable, "CONTEXT_TYPE", which coded whether the contextual trial involved the same bandit pair (coded as -0.5) or a different bandit pair (+0.5) compared to the learning trial. The Wilkinson notation for this expanded mixed-effects model is:

      REPEAT ~ CONTEXT_TYPE * FEEDBACK * (CONTEXT_2-star + CONTEXT_3-star) + BETTER + (1|participant)

      This expanded model revealed a significant three-way interaction between feedback valence, contextual credibility, and context type (F(2,4451) = 7.71, p<0.001). Interpreting this interaction, we found a 2-way interaction between context-source and feedback valence when the context was the same (F(2,4451) = 12.03, p<0.001), but not when context was different (F(2,4451) = 0.23, p = 0.79). Further interpreting the 2-way feedback-valence * context-source interaction (for the same context) we obtained the same conclusions as reported in the main text.”
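      To make the Wilkinson formula concrete, the sketch below expands it into the fixed-effects regressors for a single trial. The codings of CONTEXT_TYPE, FEEDBACK and the two context dummies follow the text. The ±0.5 coding of BETTER and the column names are assumptions for illustration, and the participant-level random intercept is handled by the mixed-model software rather than by this row.

```python
def design_row(feedback_positive, context_stars, same_pair, chose_better):
    """Fixed-effects regressors for one trial of
    REPEAT ~ CONTEXT_TYPE * FEEDBACK * (CONTEXT_2-star + CONTEXT_3-star) + BETTER."""
    main = {
        "CONTEXT_TYPE": -0.5 if same_pair else 0.5,       # same pair = -0.5, different = +0.5
        "FEEDBACK": 0.5 if feedback_positive else -0.5,   # positive = 0.5, negative = -0.5
        "CONTEXT_2star": 1.0 if context_stars == 2 else 0.0,
        "CONTEXT_3star": 1.0 if context_stars == 3 else 0.0,  # 1-star context is the reference
    }
    row = {"Intercept": 1.0, **main}
    # Assumed coding: +0.5 for the better bandit, -0.5 for the worse one.
    row["BETTER"] = 0.5 if chose_better else -0.5
    # A * B * (C + D) expands to all main effects plus every interaction
    # except C:D, because C and D enter additively.
    row["CONTEXT_TYPE:FEEDBACK"] = main["CONTEXT_TYPE"] * main["FEEDBACK"]
    for ctx in ("CONTEXT_2star", "CONTEXT_3star"):
        row[f"CONTEXT_TYPE:{ctx}"] = main["CONTEXT_TYPE"] * main[ctx]
        row[f"FEEDBACK:{ctx}"] = main["FEEDBACK"] * main[ctx]
        row[f"CONTEXT_TYPE:FEEDBACK:{ctx}"] = (
            main["CONTEXT_TYPE"] * main["FEEDBACK"] * main[ctx]
        )
    return row

# Hypothetical trial: positive feedback, 3-star context, same bandit pair, better bandit chosen
example = design_row(True, 3, True, True)
for name, value in example.items():
    print(f"{name:40s} {value:+.2f}")
```

      The three-way interaction reported above corresponds to the two CONTEXT_TYPE:FEEDBACK:CONTEXT columns, which is why its F test has 2 numerator degrees of freedom.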

      (16) "Strikingly, model-simulations (Methods) showed this pattern is not predicted by any of our other models"

      Why doesn't the Bayesian model predict this?

      Thanks for the comment. Overall, Bayesian models do predict a slight truth inference effect (see Figure 6d). However, these effects are not as strong as the ones observed in participants, suggesting that our results go beyond what would be predicted by a Bayesian model.

      Conceptually, it's important to note that the Bayesian model can infer (after controlling for source credibility and feedback valence) whether feedback is truthful based solely on prior beliefs about the chosen bandit. Using this inferred truth to amplify the weight of truthful feedback would effectively amount to “bootstrapping on one’s own beliefs.” This is most clearly illustrated with the 50% agent: if one believes that a chosen bandit yields rewards 70% of the time, then positive feedback is more likely to be truthful than negative feedback. However, a Bayesian observer would also recognize that, given the agent’s overall unreliability, such feedback should be ignored regardless.
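      The 50%-agent example can be checked with Bayes' rule. The sketch below assumes that an agent reports the true outcome with probability equal to its credibility and the opposite outcome otherwise; the function names are hypothetical.

```python
def p_feedback_truthful(p_reward, credibility, feedback_positive):
    """Posterior probability that the feedback is truthful, given a prior
    belief that the chosen bandit pays out with probability p_reward."""
    # Truthful positive feedback requires a real reward;
    # truthful negative feedback requires a real non-reward.
    p_outcome_matches = p_reward if feedback_positive else 1.0 - p_reward
    truthful = credibility * p_outcome_matches
    lying = (1.0 - credibility) * (1.0 - p_outcome_matches)
    return truthful / (truthful + lying)

def p_reward_given_feedback(p_reward, credibility, feedback_positive):
    """Posterior probability that the trial was actually rewarded."""
    p_true = p_feedback_truthful(p_reward, credibility, feedback_positive)
    # Positive feedback implies a reward if truthful;
    # negative feedback implies a reward only if it is a lie.
    return p_true if feedback_positive else 1.0 - p_true

prior, credibility = 0.7, 0.5
print(p_feedback_truthful(prior, credibility, True))      # truthfulness of positive feedback
print(p_feedback_truthful(prior, credibility, False))     # truthfulness of negative feedback
print(p_reward_given_feedback(prior, credibility, True))  # belief about reward after positive feedback
print(p_reward_given_feedback(prior, credibility, False)) # belief about reward after negative feedback
```

      With a 70% prior and a 50% agent, positive feedback is judged truthful with probability 0.7 and negative feedback with probability 0.3, yet the belief that the trial was rewarded stays at 0.7 either way. The inferred truthfulness simply mirrors the prior, so the feedback carries no new information about the bandit.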

      (17) "A striking finding in our study was that for a fully credible feedback agent, credit assignment was exaggerated (i.e., higher than predicted by a Bayesian strategy)".

      "Since we did not find any significant interactions between BETTER and the other regressors, we decided to omit it from the model formulation".

      Was this decision made after seeing the data? If so, please report the original analysis as well.

      We have included the BETTER regressor again and re-run the analyses. We now report the results of this regression. We have also changed the methods section accordingly:

      “We used a different mixed-effects binomial regression model to test whether value learning from the 3-star agent was modulated by contextual credibility. We focused this analysis on instances where the previous trial with the same bandit pair featured the 3-star agent. We regressed the variable REPEAT, which indicated whether the current trial repeated the choice from the previous trial featuring the same bandit-pair (repeated choice=1, non-repeated choice=0). We included the following regressors: FEEDBACK coding the valence of feedback in the previous trial with the same bandit pair (positive=0.5, negative=-0.5), CONTEXT_2-star indicating whether the trial immediately preceding the previous trial with the same bandit pair (context trial) featured the 2-star agent (feedback from 2-star agent=1, otherwise=0), and CONTEXT_3-star indicating whether the trial immediately preceding the previous trial with the same bandit pair featured the 3-star agent. We also included a regressor (BETTER) coding whether the bandit chosen in the learning trial was the better (mostly rewarding) or the worse (mostly unrewarding) bandit within the pair. We included in this analysis only current trials where the context trial featured a different bandit pair. The model in Wilkinson’s notation was:

      REPEAT ~ FEEDBACK * (CONTEXT_2-star + CONTEXT_3-star) + BETTER + (1|participant) (13)

      In Figure 4c, we independently calculated the repeat probability difference for the better (mostly rewarding) and worse (mostly non-rewarding) bandits and averaged across them. This calculation was done at the participant level and then averaged across participants.”
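      The averaging order described for Figure 4c can be sketched as follows. The sketch assumes that the "repeat probability difference" is P(repeat | positive feedback) minus P(repeat | negative feedback), and that it would be applied separately to the trials of each context condition. The field names and the example trials are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def feedback_effect(trials):
    """trials: iterable of dicts with keys
    participant, better (bool), feedback (+1/-1), repeat (0/1).
    Returns the across-participant mean of the per-participant effect."""
    cells = defaultdict(list)
    for t in trials:
        cells[(t["participant"], t["better"], t["feedback"])].append(t["repeat"])

    per_participant = []
    for pid in sorted({key[0] for key in cells}):
        diffs = []
        for better in (True, False):
            pos = cells.get((pid, better, 1))
            neg = cells.get((pid, better, -1))
            if pos and neg:  # skip bandit types lacking one of the feedback cells
                diffs.append(mean(pos) - mean(neg))
        if diffs:
            # Step 1: average the difference over better and worse bandits
            per_participant.append(mean(diffs))
    # Step 2: average over participants
    return mean(per_participant)

# Hypothetical trials for one participant
trials = [
    {"participant": 1, "better": True, "feedback": 1, "repeat": 1},
    {"participant": 1, "better": True, "feedback": -1, "repeat": 0},
    {"participant": 1, "better": False, "feedback": 1, "repeat": 1},
    {"participant": 1, "better": False, "feedback": -1, "repeat": 1},
]
print(feedback_effect(trials))
```

      Computing the difference within each bandit type before averaging keeps the better and worse bandits equally weighted, even if a participant chose one of them far more often.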

    1. Where higher education for women was advocated for material reasons, stress was laid on the need to equip middle-class women to earn a living as teachers and protect them from the risk of downward mobilit

      For Jourdain, it allowed her to 1) pursue a career in higher education; 2) enrol for a doctorate in Paris; 3) write a dissertation; 4) be hired as Vice-head at St. Hughe