  1. Feb 2024
    1. Joint Public Review:

      Yamanaka et al.'s research investigates the impact of volatile organic compounds (VOCs), particularly diacetyl, on gene expression changes. By inhibiting histone deacetylase (HDAC) enzymes, the authors were able to observe changes in the transcriptome of various models, including cell lines, flies, and mice. The study reveals that HDAC inhibitors not only reduce cancer cell proliferation but also provide relief from neurodegeneration in fly Huntington's disease models. The revised manuscript addresses the key queries raised in the initial reviews.

    1. eLife assessment

      This study presents a useful method for the extraction of behaviour-related activity from neural population recordings based on a specific deep learning architecture - a variational autoencoder. However, the evidence supporting the scientific claims resulting from the application of this method is incomplete as the results may stem, in part, from its properties. The main limitations are: (1) benchmarking against comparable methods is limited; and (2) some observations may be a byproduct of their method, and may not constitute new scientific observations.

    2. Reviewer #1 (Public Review):

      This work seeks to understand how behaviour-related information is represented in the neural activity of the primate motor cortex. To this end, a statistical model of neural activity is presented that enables a non-linear separation of behaviour-related from unrelated activity. As a generative model, it enables the separate analysis of these two activity modes, here primarily done by assessing the decoding performance of hand movements the monkeys perform in the experiments. Several lines of analysis are presented to show that while the neurons with significant tuning to movements strongly contribute to the behaviourally-relevant activity subspace, less or un-tuned neurons also carry decodable information. It is further shown that the discovered subspaces enable linear decoding, leading the authors to conclude that motor cortex read-out can be linear.

      Strengths:

      In my opinion, using an expressive generative model to analyse neural state spaces is an interesting approach to understand neural population coding. While potentially sacrificing interpretability, this approach allows capturing both redundancies and synergies in the code as done in this paper. The model presented here is a natural non-linear extension of a previous linear model (PSID) and uses weak supervision in a manner similar to a previous non-linear model (TNDM).

      Weaknesses:

      This revised version provides additional evidence to support the authors' claims regarding model performance and interpretation of the structure of the resulting latent spaces, in particular the distributed neural code over the whole recorded population, not just the well-tuned neurons. The improved ability to linearly decode behaviour from the relevant subspace and the analysis of the linear subspace projections in my opinion convincingly demonstrate that the model picks up behaviour-relevant dynamics, and that these are distributed widely across the population. As reviewer 3 also points out, I would, however, caution against interpreting this as evidence for linear read-out of the motor system - your model performs a non-linear transformation, and while this is indeed linearly decodable, the motor system would need to do something similar first to achieve the same. In fact, to me it seems to show the opposite: that behaviour-related information may not be generally accessible to linear decoders (including to down-stream brain areas).

      As in my initial review, I would also caution against making strong claims about identifiability, although this work and TNDM seem to show that in practice such methods work quite well. CEBRA, in contrast, offers some theoretical guarantees, but it is not a generative model, so would not allow the type of analysis done in this paper. In your model there is a parameter \alpha to balance between neural and behaviour reconstruction. This seems very similar to TNDM and has to be optimised - if this is correct, then there is manual intervention required to identify a good model.

      Somewhat related, I also found that the now comprehensive comparison with related models shows that using decoding performance (R2) as a metric for model comparison may be problematic: the R2 values reported in Figure 2 (e.g. the MC_RTT dataset) should be compared to the values reported in the neural latent benchmark, which represent well-tuned models (e.g. AutoLFADS). The numbers (difficult to see, a table with numbers in the appendix would be useful, see: https://eval.ai/web/challenges/challenge-page/1256/leaderboard) seem lower than what can be obtained with models without latent space disentanglement. While this does not necessarily invalidate the conclusions drawn here, it shows that decoding performance can depend on a variety of model choices, and may not be ideal to discriminate between models. I'm also surprised by the low neural R2 for LFADS (I assume this is condition-averaged) - LFADS tends to perform very well on this metric.

      One statement I still cannot follow is how the prior of the variational distribution is modelled. You say you depart from the usual Gaussian prior, but equation 7 seems to suggest there is a normal prior. Are the parameters of this distribution learned? As I pointed out earlier, I however suspect this may not matter much as you give the prior a very low weight. I also still am not sure how you generate a sample from the variational distribution, do you just draw one for each pass?

      Summary:

      This paper presents a very interesting analysis, but some concerns remain that mainly stem from the complexity of deep learning models. It would be good to acknowledge these as readers without relevant background need to understand where the possible caveats are.

    3. Reviewer #2 (Public Review):

      Li et al present a method to extract "behaviorally relevant" signals from neural activity. The method is meant to solve a problem which likely has high utility for neuroscience researchers. There are numerous existing methods to achieve this goal, some of which the authors compare their method to. Thankfully, the revised version includes one of the major previous omissions (TNDM). However, I still believe that d-VAE is a promising approach that has its own advantages. Still, I have issues with the paper as-is. The authors have made relatively few modifications to the text based on my previous comments, and the responses have largely just dismissed my feedback and restated claims from the paper. Nearly all of my previous comments remain relevant for this revised manuscript. As such, they have done little to assuage my concerns, the most important of which I will restate here using the labels/notation (Q1, Q2, etc) from the reviewer response.

      Q1) I still remain unconvinced that the core findings of the paper are "unexpected". In the response to my previous Specific Comment #1, they say "We use the term 'unexpected' due to the disparity between our findings and the prior understanding concerning neural encoding and decoding." However, they provide no citations or grounding for why they make those claims. What prior understanding makes it unexpected that encoding is more complex than decoding given the entropy, sparseness, and high dimensionality of neural signals (the "encoding") compared to the smoothness and low dimensionality of typical behavioural signals (the "decoding")?

      Q2) I still take issue with the premise that signals in the brain are "irrelevant" simply because they do not correlate with a fixed temporal lag with a particular behavioural feature hand-chosen by the experimenter. In the response to my previous review, the authors say "we employ terms like 'behaviorally-relevant' and 'behaviorally-irrelevant' only regarding behavioral variables of interest measured within a given task, such as arm kinematics during a motor control task.". This is just a restatement of their definition, not a response to my concern, and does not address my concern that the method requires a fixed temporal lag and continual decoding/encoding. My example of reward signals remains. There is a huge body of literature dating back to the 70s on the linear relationships between neural activity and arm kinematics; in a sense, the authors have chosen the "variable of interest" that proves their point. This all ties back to the previous comment: this is mostly expected, not unexpected, when relating apparently-stochastic, discrete action potential events to smoothly varying limb kinematics.

      Q5) The authors seem to have missed the spirit of my critique: to say "linear readout is performed in motor cortex" is an over-interpretation of what their model can show.

      Q7) Agreeing with my critique is not sufficient; please provide the data or simulations that provide the context for the reference in the Fano factor. I believe my critique is still valid.

      Q8) Thank you for comparing to TNDM, it's a useful benchmark.

    4. Reviewer #4 (Public Review):

      I am a new reviewer for this manuscript, which has been reviewed before. The authors provide a variational autoencoder that has three objectives in the loss: linear reconstruction of behavior from embeddings, reconstruction of neural data, and KL divergence term related to the variational model elements. They take the output of the VAE as the "behaviorally relevant" part of neural data and call the residual "behaviorally irrelevant". Results aim to inspect the linear versus nonlinear behavior decoding using the original raw neural data versus the inferred behaviorally relevant and irrelevant parts of the signal.
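      To make the three-term objective described above concrete, here is a minimal NumPy sketch of such a loss. It is an illustration of the structure the reviewer describes, not the authors' implementation: the function name `dvae_style_loss`, the weights `alpha` and `beta`, the affine readout parameters `W` and `b`, and the standard-normal prior in the KL term are all assumptions made for this example.

```python
import numpy as np

def dvae_style_loss(x, x_hat, y, z_mu, z_logvar, W, b, alpha=1.0, beta=1e-3):
    """Hypothetical three-term loss: neural reconstruction
    + alpha * error of an affine behavior readout from the embedding
    + beta * KL(q(z|x) || N(0, I)). All names are illustrative."""
    # Neural reconstruction error, averaged over samples and neurons.
    neural_mse = np.mean((x - x_hat) ** 2)
    # Behavior is read out from the embedding by an affine map (assumption).
    y_hat = z_mu @ W + b
    behavior_mse = np.mean((y - y_hat) ** 2)
    # Closed-form KL between a diagonal Gaussian posterior and a standard
    # normal prior (assumed here), summed over latent dims, averaged over samples.
    kl = np.mean(-0.5 * np.sum(1.0 + z_logvar - z_mu**2 - np.exp(z_logvar), axis=1))
    return neural_mse + alpha * behavior_mse + beta * kl
```

      Because the behavior term rewards an embedding from which behavior is affinely decodable, a loss of this shape can by itself favor good linear decoding, which is the bias the points below are concerned with.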

      Overall, studying neural computations that are behaviorally relevant or not is an important problem, which several previous studies have explored (for example PSID in (Sani et al. 2021), TNDM in (Hurwitz et al. 2021), TAME-GP in (Balzani et al. 2023), pi-VAE in (Zhou and Wei 2020), and dPCA in (Kobak et al. 2016), etc). However, this manuscript does not properly put their work in the context of such prior works. For example, the abstract states "One solution is to accurately separate behaviorally-relevant and irrelevant signals, but this approach remains elusive", which is not the case given that these prior works have done that. The same is true for various claims in the main text, for example "Furthermore, we found that the dimensionality of primary subspace of raw signals (26, 64, and 45 for datasets A, B, and C) is significantly higher than that of behaviorally-relevant signals (7, 13, and 9), indicating that using raw signals to estimate the neural dimensionality of behaviors leads to an overestimation" (line 321). This finding was presented in (Sani et al. 2021) and (Hurwitz et al. 2021), which is not clarified here. This issue of putting the work in context has been brought up by other reviewers previously but seems to remain largely unaddressed. The introduction is inaccurate also in that it mixes up methods that were designed for separation of behaviorally relevant information with those that are unsupervised and do not aim to do so (e.g., LFADS). The introduction should be significantly revised to explicitly discuss prior models/works that specifically formulated this behavior separation and what these prior studies found, and how this study differs.

      Beyond the above, some of the main claims/conclusions made by the manuscript are not properly supported by the analyses and results, which has also been brought up by other reviewers but not fully addressed. First, the analyses here do not support the linear readout from the motor cortex because i) by construction, the VAE here is trained to have a linear readout from its embedding in its loss, which can bias its outputs toward doing well with a linear decoder/readout, and ii) the overall mapping from neural data to behavior includes both the VAE and the linear readout and thus is always nonlinear (even when a linear Kalman filter is used for decoding). This claim is also vague as there is no definition of readout from "motor cortex" or what it means. Why is the readout from the bottleneck of this particular VAE the readout of motor cortex? Second, other claims about properties of individual neurons are also confounded because the VAE is a population-level model that extracts the bottleneck from all neurons. Thus, information can leak from any set of neurons to other sets of neurons during the inference of behaviorally relevant parts of signals. Overall, the results do not convincingly support the claims, and thus the claims should be carefully revised and significantly tempered to avoid misinterpretation by readers.

      Below I briefly expand on these as well as other issues, and provide suggestions:

      1) Claims about linearity of "motor cortex" readout are not supported by results yet stated even in the abstract. Instead, what the results support is that for decoding behavior from the output of the dVAE model -- that is trained specifically to have a linear behavior readout from its embedding -- a nonlinear readout does not help. This result can be biased by the very construction of the dVAE's loss that encourages a linear readout/decoding from embeddings, and thus does not imply a finding about motor cortex.

      2) Related to the above, it is unclear what the manuscript means by readout from motor cortex. A clearer definition of "readout" (a mapping from what to what?) in general is needed. The mapping that the linearity/nonlinearity claims refer to is from the *inferred* behaviorally relevant neural signals, which themselves are inferred nonlinearly using the VAE. This should be explicitly clarified in all claims, i.e., that only the mapping from distilled signals to behavior is linear, not the whole mapping from neural data to behavior. Again, to say the readout from motor cortex is linear is not supported, including in the abstract.

      3) Claims about individual neurons are also confounded. The d-VAE distillation process is a population-level embedding, so the individual distilled neurons are not obtainable on their own without using the population data. This population-level approach also raises the possibility that information can leak from one neuron to another during distillation, which is indeed what the authors hope would recover true information about individual neurons that wasn't there in the recording (the pixel denoising example). The authors acknowledge the possibility that information could leak to a neuron that didn't truly have that information and try to rule it out to some extent with some simulations and by comparing the distilled behaviorally relevant signals to the original neural signals. But ultimately, the distilled signals are different enough from the original signals to substantially improve decoding of low information neurons, and one cannot be sure if all of the information in distilled signals from any individual neuron truly belongs to that neuron. It is still quite likely that some of the improved behavior prediction of the distilled version of low-information neurons is due to leakage of behaviorally relevant information from other neurons, not the former's inherent behavioral information. This should be explicitly acknowledged in the manuscript.

      4) Given the nuances involved in appropriate comparisons across methods and since two of the datasets are public, the authors should provide their complete code (not just the dVAE method code), including the code for data loading, data preprocessing, model fitting and model evaluation for all methods and public datasets. This will alleviate concerns and allow readers to confirm conclusions (e.g., figure 2) for themselves down the line.

      5) Related to 1) above, the authors should explore the results if the affine network h(.) (from embedding to behavior) was replaced with a nonlinear ANN. Perhaps linear decoders would no longer be as close to nonlinear decoders. Regardless, the claim of linearity should be revised as described in 1) and 2) above, and all caveats should be discussed.

      6) The beginning of the section on the "smaller R2 neurons" should clearly define what R2 is being discussed. Based on the response to previous reviewers, this R2 "signifies the proportion of neuronal activity variance explained by the linear encoding model, calculated using raw signals". This should be mentioned and made clear in the main text whenever this R2 is referred to.

      7) Various terms require clear definitions. The authors sometimes use vague terminology (e.g., "useless") without a clear definition. Similarly, discussions regarding dimensionality could benefit from more precise definitions. How is neural dimensionality defined? For example, how is "neural dimensionality of specific behaviors" (line 590) defined? Related to this, I agree with Reviewer 2 that a clear definition of irrelevant should be mentioned that clarifies that relevance is roughly taken as "correlated or predictive with a fixed time lag". The analyses do not explore relevance with arbitrary time lags between neural and behavior data.

      8) CEBRA itself doesn't provide a neural reconstruction from its embeddings, but one could obtain one via a regression from extracted CEBRA embeddings to neural data. In addition to decoding results of CEBRA (figure S3), the neural reconstruction of CEBRA should be computed and CEBRA should be added to Figure 2 to see how the behaviorally relevant and irrelevant signals from CEBRA compare to other methods.
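      The regression suggested in point 8 could look roughly like the sketch below. It assumes the embeddings have already been extracted and are available as a plain array; no CEBRA API is called, and the helper names `fit_linear_map` and `reconstruct` are hypothetical. Ordinary least squares with an intercept is only one possible choice of regression.

```python
import numpy as np

def fit_linear_map(embedding, neural):
    """Least-squares map (with intercept) from embeddings to neural activity.
    embedding: (n_samples, n_latents); neural: (n_samples, n_neurons)."""
    design = np.hstack([embedding, np.ones((embedding.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, neural, rcond=None)
    return coef

def reconstruct(embedding, coef):
    """Apply the fitted map to obtain a neural reconstruction."""
    design = np.hstack([embedding, np.ones((embedding.shape[0], 1))])
    return design @ coef

# Synthetic stand-in data (not real recordings or real CEBRA embeddings).
rng = np.random.default_rng(0)
embedding = rng.normal(size=(200, 3))
neural = embedding @ rng.normal(size=(3, 10)) + 0.5

coef = fit_linear_map(embedding, neural)
relevant = reconstruct(embedding, coef)   # candidate "behaviorally relevant" part
residual = neural - relevant              # candidate "irrelevant" part
```

      In practice the map would be fitted on training data and evaluated on held-out data, so that the reconstruction is comparable with the other methods in Figure 2.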

      References:

      Kobak, Dmitry, Wieland Brendel, Christos Constantinidis, Claudia E Feierstein, Adam Kepecs, Zachary F Mainen, Xue-Lian Qi, Ranulfo Romo, Naoshige Uchida, and Christian K Machens. 2016. "Demixed Principal Component Analysis of Neural Population Data." Edited by Mark CW van Rossum. eLife 5 (April): e10989. https://doi.org/10.7554/eLife.10989.

      Sani, Omid G., Hamidreza Abbaspourazad, Yan T. Wong, Bijan Pesaran, and Maryam M. Shanechi. 2021. "Modeling Behaviorally Relevant Neural Dynamics Enabled by Preferential Subspace Identification." Nature Neuroscience 24 (1): 140-49. https://doi.org/10.1038/s41593-020-00733-0.

      Zhou, Ding, and Xue-Xin Wei. 2020. "Learning Identifiable and Interpretable Latent Models of High-Dimensional Neural Activity Using Pi-VAE." In Advances in Neural Information Processing Systems, 33:7234-47. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2020/hash/510f2318f324cf07fce24c3a4b89c771-Abstract.html.

      Hurwitz, Cole, Akash Srivastava, Kai Xu, Justin Jude, Matthew Perich, Lee Miller, and Matthias Hennig. 2021. "Targeted Neural Dynamical Modeling." In Advances in Neural Information Processing Systems. Vol. 34. https://proceedings.neurips.cc/paper/2021/hash/f5cfbc876972bd0d031c8abc37344c28-Abstract.html.

      Balzani, Edoardo, Jean-Paul G. Noel, Pedro Herrero-Vidal, Dora E. Angelaki, and Cristina Savin. 2023. "A Probabilistic Framework for Task-Aligned Intra- and Inter-Area Neural Manifold Estimation." In . https://openreview.net/forum?id=kt-dcBQcSA.

    1. eLife assessment

      The study makes a valuable empirical contribution to our understanding of visual processing in primates and deep neural networks, with a specific focus on the concept of factorization. The analyses provide solid evidence that high factorization scores are correlated with neural predictivity, yet more evidence would be needed to show that neural responses show factorization. Consequently, while several aspects require further clarification, in its current form this work is interesting to systems neuroscientists studying vision and could inspire further research that ultimately may lead to better models of or a better understanding of the brain.

    2. Reviewer #1 (Public Review):

      Summary:

      The paper investigates visual processing in primates and deep neural networks (DNNs), focusing on factorization in the encoding of scene parameters. It challenges the conventional view that object classification is the primary function of the ventral visual stream, suggesting instead that the visual system employs a nuanced strategy involving both factorization and invariance. The study also presents empirical findings suggesting a correlation between high factorization scores and good neural predictivity.

      Strengths:

      1. Novel Perspective: The paper introduces a fresh viewpoint on visual processing by emphasizing the factorization of non-class information.

      2. Methodology: The use of diverse datasets from primates and humans, alongside various computational models, strengthens the validity of the findings.

      3. Detailed Analysis: The paper suggests metrics for factorization and invariance, contributing to the future understanding and measurement of these concepts.

      Weaknesses:

      1. Vagueness (Perceptual or Neural Invariance?): The paper uses the term 'invariance', typically referring to perceptual stability despite stimulus variability [1], as the complete discarding of nuisance information in neural activity. This oversimplification overlooks the nuanced distinction between perceptual invariance (e.g., invariant object recognition) and neural invariance (e.g., no change in neural activity). It seems that by 'invariance' the authors mean 'neural' invariance (rather than 'perceptual' invariance) in this paper, which is vague. The paper could benefit from changing what is called 'invariance' in the paper to 'neural invariance' and distinguish it from 'perceptual invariance,' to avoid potential confusion for future readers. The assignment of 'compact' representation to 'invariance' in Figure 1A is misleading (although it can be addressed by the clarification on the term invariance). [1] DiCarlo JJ, Cox DD. Untangling invariant object recognition. Trends in cognitive sciences. 2007 Aug 1;11(8):333-41.

      2. Details on Metrics: The paper's explanation of factorization as encoding variance independently or uncorrelatedly needs more justification and elaboration. The definition of 'factorization' in Figure 1B seems to be potentially misleading, as the metric for factorization in the paper seems to be defined regardless of class information (can be defined within a single class). Does the factorization metric as defined in the paper (orthogonality of different sources of variation) warrant that responses for different object classes are aligned/parallel like in 1B (middle)? More clarification around this point could make the paper much richer and more interesting.

      3. Factorization vs. Invariance: Is it fair to present invariance vs. factorization as mutually exclusive options in representational hypothesis space? Perhaps a more fair comparison would be factorization vs. object recognition, as it is possible to have different levels of neural variability (or neural invariance) underlying both factorization and object recognition tasks.

      4. Potential Confounding Factors in Empirical Findings: The correlation observed in Figure 3 between factorization and neural predictivity might be influenced by data dimensionality, rather than factorization per se [2]. Incorporating discussions around this recent finding could strengthen the paper.

      [2] Elmoznino E, Bonner MF. High-performing neural network models of the visual cortex benefit from high latent dimensionality. bioRxiv. 2022 Jul 13:2022-07.

      Conclusion:

      The paper offers insightful empirical research with useful implications for understanding visual processing in primates and DNNs. The paper would benefit from a more nuanced discussion of perceptual and neural invariance, as well as a deeper discussion of the coexistence of factorization, recognition, and invariance in neural representation geometry. Additionally, addressing the potential confounding factors in the empirical findings on the correlation between factorization and neural predictivity would strengthen the paper's conclusions.

    3. Reviewer #2 (Public Review):

      Summary:

      The dominant paradigm in the past decade for modeling the ventral visual stream's response to images has been to train deep neural networks on object classification tasks and regress neural responses from units of these networks. While object classification performance is correlated to the variance explained in the neural data, this approach has recently hit a plateau of variance explained, beyond which increases in classification performance do not yield improvements in neural predictivity. This suggests that classification performance may not be a sufficient objective for building better models of the ventral stream. Lindsey & Issa study the role of factorization in predicting neural responses to images, where factorization is the degree to which variables such as object pose and lighting are represented independently in orthogonal subspaces. They propose factorization as a candidate objective for breaking through the plateau suffered by models trained only on object classification. They claim that (i) maintaining these non-class variables in a factorized manner yields better neural predictivity than ignoring non-class information entirely, and (ii) factorization may be a representational strategy used by the brain.

      The first of these claims is supported by their data. The second claim does not seem well-supported, and the usefulness of their observations is not entirely clear.

      Strengths:

      This paper challenges the dominant approach to modeling neural responses in the ventral stream, which itself is valuable for diversifying the space of ideas.

      This paper uses a wide variety of datasets, spanning multiple brain areas and species. The results are consistent across the datasets, which is a great sign of robustness.

      The paper uses a large set of models from many prior works. This is impressively thorough and rigorous.

      The authors are very transparent, particularly in the supplementary material, showing results on all datasets. This is excellent practice.

      Weaknesses:

      1. The primary weakness of this paper is a lack of clarity about what exactly is the contribution. I see two main interpretations: (1-A) As introducing a heuristic for predicting neural responses that improves over classification accuracy, and (1-B) as a model of the brain's representational strategy. These two interpretations are distinct goals, each of which is valuable. However, I don't think the paper in its current form supports either of them very well:

      (1-A) Heuristic for neural predictivity. The claim here is that by optimizing for factorization, we could improve models' neural predictivity to break through the current predictivity plateau. To frame the paper in this way, the key contribution should be a new heuristic that correlates with neural predictivity better than classification accuracy. The paper currently does not do this. The main piece of evidence that factorization may yield a more useful heuristic than classification accuracy alone comes from Figure 5. However, in Figure 5 it seems that factorization along some factors is more useful than others, and different linear combinations of factorization and classification may be best for different data. There is no single heuristic presented and defended. If the authors want to frame this paper as a new heuristic for neural predictivity, I recommend the authors present and defend a specific heuristic that others can use, e.g. [K * factorization_of_pose + classification] for some constant K, and show that (i) this correlates with neural predictivity better than classification alone, and (ii) this can be used to build models with higher neural predictivity. For (ii), they could fine-tune a state-of-the-art model to improve this heuristic and show that doing so achieves a new state-of-the-art neural predictivity. That would be convincing evidence that their contribution is useful.

      (1-B) Model of representation in the brain. The claim here is that factorization is a general principle of representation in the brain. However, neural predictivity is not a suitable metric for this, because (i) neural predictivity allows arbitrary linear decoders, hence is invariant to the orthogonality requirement of factorization, and (ii) neural predictivity does not match the network representation to the brain representation. A better metric is representational dissimilarity matrices. However, the RDM results in Figure S4 actually seem to show that factorization does not do a very good job of predicting neural similarity (though the comparison to classification accuracy is not shown), which suggests that factorization may not be a general principle of the brain. If the authors want to frame the paper in terms of discovering a general principle of the brain, I suggest they use a metric (or suite of metrics) of brain similarity that is sensitive to the desiderata of factorization, e.g. doesn't apply arbitrary linear transformations, and compare to classification accuracy in addition to invariance.

      Overall, I suggest the authors clarify exactly what their claim is, then focus on that claim and present results to justify it. If neither of the claims above can be supported by evidence, then this paper still has value as an idea that they spent effort trying to test, but they should not suggest these claims in the paper. In that case, it may also be possible to increase the value of the contribution by characterizing how the structure of class-free variable representations impacts correlation with neural fit, instead of just comparing existence vs absence (invariance) of this information. For example, evaluate the degree to which local or global orthogonality matters, or the degree to which curvature of the embedding matters.

      2. I think the comparison to invariance, which is pervasive throughout the paper, is not very informative. First, it is not surprising that invariance is more weakly correlated with neural predictivity than factorization, because invariant representations lose information compared to factorized representations. Second, there has long been extensive evidence that responses throughout the ventral stream are not invariant to the factors the authors consider, so we already knew that invariance is not a good characterization of ventral stream data.

      3. The formalization of the factorization metric is not particularly elegant, because it relies on computing top K principal components for the other-parameter space, where K is arbitrarily chosen as 10. While the authors do show that in their datasets the results are not very sensitive to K (Figure S5), that is not guaranteed to be the case in general. I suggest the authors try to come up with a formalization that doesn't have arbitrary constants. For example, one possibility that comes to mind is E[delta_a x delta_b], where 'x' is the normalized cross product, delta_a, and delta_b are deltas in representation space induced by perturbations of factors a and b, and the expectation is taken over all base points and deltas. This is just the first thing that comes to mind, and I'm sure the authors can come up with something better. The literature on disentangling metrics in machine learning may be useful for ideas on measuring factorization.

      4. The authors defined the term "factorization" according to their metric. I think introducing this new term is not necessary and can be confusing because the term "factorization" is vague and used by different researchers in different ways. Perhaps a better term is "orthogonality", because that is clear and seems to be what the authors' metric is measuring.

      5. One general weakness of the factorization paradigm is the reliance on a choice of factors. This is a subjective choice and becomes an issue as you scale to more complex images where the choice of factors is not obvious. While this choice of factors cannot be avoided, I suggest the authors add two things: First, an analysis of how sensitive the results are to the choice of factors (e.g. transform the basis set of factors and re-run the metric); second, include some discussion about how factors may be chosen in general (e.g. based on temporal statistics of the world, independent components analysis, or something else).
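
      The top-K formalization discussed in point 3 can be illustrated with a minimal sketch. It assumes one plausible reading of the metric as described in these reviews: the fraction of variance induced by one factor that falls outside the top-K principal subspace of the variance induced by the other factors. The function name `factorization`, the array shapes, and the toy data are assumptions made for illustration; this is not the authors' code.

```python
import numpy as np


def factorization(resp_a, resp_other, k=10):
    """Fraction of factor-a variance lying outside the top-k PCs of the other factors.

    resp_a:     (n_samples, n_units) responses while only factor a varies.
    resp_other: (n_samples, n_units) responses while the other factors vary.
    k:          number of principal components kept (the arbitrary constant K).
    """
    a = resp_a - resp_a.mean(axis=0)
    other = resp_other - resp_other.mean(axis=0)
    # Rows of vt are the principal axes of the other-factor variance.
    _, _, vt = np.linalg.svd(other, full_matrices=False)
    basis = vt[:k]
    var_total = np.sum(a ** 2)
    var_in_subspace = np.sum((a @ basis.T) ** 2)
    return 1.0 - var_in_subspace / var_total


# Toy data (hypothetical): factor a drives units 0-4 only.
rng = np.random.default_rng(0)
n_samples, n_units = 200, 50
resp_a = np.zeros((n_samples, n_units))
resp_a[:, :5] = rng.normal(size=(n_samples, 5))

# Other factors drive a disjoint set of units (orthogonal subspace).
resp_orth = np.zeros((n_samples, n_units))
resp_orth[:, 5:15] = rng.normal(size=(n_samples, 10))

# Other factors drive the same units as factor a (shared subspace).
resp_same = np.zeros((n_samples, n_units))
resp_same[:, :5] = rng.normal(size=(n_samples, 5))

score_orth = factorization(resp_a, resp_orth, k=10)  # close to 1
score_same = factorization(resp_a, resp_same, k=5)   # close to 0

# Sensitivity of the score to the choice of K.
scores_by_k = {k: factorization(resp_a, resp_orth, k=k) for k in (2, 5, 10)}
```

      In this toy case the score does not depend on K because the two subspaces are exactly orthogonal. With real data, where the other-factor variance has a long tail of small components, the score can change with K, which is the concern raised in point 3.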

    4. Reviewer #3 (Public Review):

      Summary:<br /> Object classification serves as a vital normative principle in both the study of the primate ventral visual stream and deep learning. Different models exhibit varying classification performances and organize information differently. Consequently, a thriving research area in computational neuroscience involves identifying meaningful properties of neural representations that act as bridges connecting performance and neural implementation. In the work of Lindsey and Issa, the concept of factorization is explored, which has strong connections with emerging concepts like disentanglement [1,2,3] and abstraction [4,5]. Their primary contributions encompass two facets: (1) The proposition of a straightforward method for quantifying the degree of factorization in visual representations. (2) A comprehensive examination of this quantification through correlation analysis across deep learning models.

      To elaborate, their methodology, inspired by prior studies [6], employs visual inputs featuring a foreground object superimposed onto natural backgrounds. Four types of scene variables, such as object pose, are manipulated to induce variations. To assess the level of factorization within a model, they systematically alter one of the scene variables of interest and estimate the proportion of encoding variances attributable to the parameter under consideration.

      The central assertion of this research is that factorization represents a normative principle governing biological visual representation. The authors substantiate this claim by demonstrating an increase in factorization from macaque V4 to IT, supported by evidence from correlation analyses revealing a positive correlation between factorization and decoding performance. Furthermore, they advocate for the inclusion of factorization as part of the objective function for training artificial neural networks. To validate this proposal, the authors systematically conduct correlation analyses across a wide spectrum of deep neural networks and datasets sourced from human and monkey subjects. Specifically, their findings indicate that the degree of factorization in a deep model positively correlates with its ability to predict neural data (i.e., goodness of fit).

      Strengths:<br /> The primary strength of this paper is the authors' efforts in systematically conducting analysis across different organisms and recording methods. Also, the definition of factorization is simple and intuitive to understand.

      Weaknesses:<br /> This work exhibits two primary weaknesses that warrant attention: (i) the definition of factorization and its comparison to previous, relevant definitions, and (ii) the chosen analysis method.

      Firstly, the definition of factorization presented in this paper is founded upon the variances of representations under different stimulus variations. However, this definition can be seen as a structural assumption rather than capturing the effective geometric properties pertinent to computation. More precisely, the definition here is primarily statistical in nature, whereas previous methodologies incorporate computational aspects such as deviation from ideal regressors [1], symmetry transformations [3], generalization [5], among others. It would greatly enhance the paper's depth and clarity if the authors devoted a section to comparing their approach with previous methodologies [1,2,3,4,5], elucidating any novel insights and advantages stemming from this new definition.

      Secondly, in order to establish a meaningful connection between factorization and computation, the authors rely on a straightforward synthetic model (Figure 1c) and employ multiple correlation analyses to investigate relationships between the degree of factorization, decoding performance, and goodness of fit. Nevertheless, the results derived from the synthetic model are limited to the low training-sample regime. It remains unclear whether the biological datasets under consideration fall within this low training-sample regime or not.

      [1] Eastwood, Cian, and Christopher KI Williams. "A framework for the quantitative evaluation of disentangled representations." International Conference on Learning Representations. 2018.<br /> [2] Kim, Hyunjik, and Andriy Mnih. "Disentangling by factorising." International Conference on Machine Learning. PMLR, 2018.<br /> [3] Higgins, Irina, et al. "Towards a definition of disentangled representations." arXiv preprint arXiv:1812.02230 (2018).<br /> [4] Bernardi, Silvia, et al. "The geometry of abstraction in the hippocampus and prefrontal cortex." Cell 183.4 (2020): 954-967.<br /> [5] Johnston, W. Jeffrey, and Stefano Fusi. "Abstract representations emerge naturally in neural networks trained to perform multiple tasks." Nature Communications 14.1 (2023): 1040.<br /> [6] Majaj, Najib J., et al. "Simple learned weighted sums of inferior temporal neuronal firing rates accurately predict human core object recognition performance." Journal of Neuroscience 35.39 (2015): 13402-13418.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This study examines the cortical modular functional organization of visual texture in comparison with that of color and disparity. While color, disparity, and orientation have been shown to exhibit clear functional organizations within the thin, thick, and thick/pale stripes of V2, whether the feature of texture is also organized within V2 is unknown. Using ultrahigh field 7T fMRI in humans viewing color-, disparity-, and texture-specific visual stimuli, the authors find that, unlike color and disparity, texture does not exhibit stripe-specific organization in V2. Moreover, using laminar imaging methods and calculations of informational connectivity, they find V2 color and disparity stripes exhibit the expected feedforward and feedback relationships with V1 & V4, and with V1 & V3ab, respectively. In contrast, texture activation, found predominantly in the deep layers of V2, is driven preferentially by feedback from V4. Based on these findings, the authors suggest that texture is a visual feature computed in higher-order areas and not generated by local intra-V2 computation.

      Strengths:<br /> This study poses an interesting and fundamental question regarding the relationship between functional modularity and the hierarchical origin of computed properties. This question is thus highly significant and deserves study. The methodology is appropriate for the question and the areal and laminar resolution achieved across 10 subjects is commendable. The combination of high-resolution functional imaging and informational connectivity analysis introduces a useful way for examining feedforward and feedback relationships in mesoscale imaging data.

      Weaknesses:<br /> While the data are suggestive, further controls are needed.

      To support the finding that texture is not represented in a modular fashion, additional possibilities must be considered. These include (a) the effectiveness and specificity of the texture stimulus and control stimuli, (b) further analysis of possible structure in images that may have been missed, and (c) limitations of imaging resolution.

      More in-depth analysis of subject data is needed. The apparent structure in the texture images in the peripheral fields of some subjects calls for more detailed analysis, e.g., of its relationship to eccentricity, and for a 'modularity index' to quantify the degree of modularity.

      Given what is known about modular organization in V4 and V3 (e.g., for color, orientation, curvature), did the images reveal these organizations? If so, connectivity analysis based on such ROIs would be improved. This would further strengthen the hierarchical scheme.

    2. Reviewer #2 (Public Review):

      High-resolution functional magnetic resonance imaging (fMRI) at ultra-high magnetic field strengths (7 T and above) can potentially study cortical functioning at the mesoscopic scale, i.e., at the spatial scale of cortical columns and layers. The authors of the study entitled "Mesoscale functional organization and connectivity of color, disparity, and naturalistic texture in human second visual area" remarkably show the current possibilities of high-resolution fMRI methods by studying the columnar and laminar organization for the processing of color, binocular disparity, and naturalistic texture in human secondary visual cortex (V2).

      The study could robustly show color-selective and disparity-selective stripes in human V2. While this was already demonstrated in several in vivo studies using fMRI (Nasr et al., 2016, J Neurosci, 36, 1841-1857; Dumoulin et al., 2017, Sci Rep, 7, 733; Tootell et al., 2021, Cereb Cortex, 31, 1163-1181; Navarro et al., 2021, NeuroImage, 225, 117520; Kennedy et al., 2023, Prog Neurobiol, 220, 102374; Haenelt et al., 2023, eLife, 12, e78756), the strength, in my opinion, of the current study is three-fold:

      1. Previous studies mainly focused on the columnar organization of the stripe architecture in V2, neglecting any information across cortical depth. This study included a laminar analysis, which showcases the current possibilities of high-resolution fMRI methods that target the cortical local circuitry at the mesoscopic level.

      2. The successful mapping of color-selective and disparity-selective stripes in V2 was corroborated by an innovative connectivity analysis, which shows the expected higher connectivity of color-selective clusters in V2 with area V4 and binocular disparity with area V3ab.

      3. Furthermore, in addition to color-selective and disparity-selective stripes in V2 that were already shown in several studies at the columnar level (but without a laminar analysis), this study included naturalistic textures and analyzed the mesoscopic processing in V2. As expected, they showed greater sensitivity for texture selectivity in higher-order areas such as V4 and V3ab. In addition, due to the laminar analysis, feedforward and feedback connectivity were shown to be differentiable, demonstrating that texture processing in V2 is instead driven by feedback from higher-order areas.

      Overall, the study shows interesting results that are valuable for the general neuroscientific community. In addition, the manuscript is understandable and clearly written.

      However, a few points might be worth discussing:

      1. In lines 162-163, it is stated that no clear columnar organization exists for naturalistic texture processing in V2. In my opinion, this should be rephrased. As far as I understand, Figure 2B refers to the analysis used to support the conclusion. The left and middle bar plots only show a circular analysis since ROIs were based on the color and disparity contrast used to define thin and thick stripes. The interesting graph is the right plot, which shows no statistically significant overlap of texture processing with thin, thick, and pale stripe ROIs. It should be pointed out that this analysis does not dismiss a columnar organization per se but instead only supports the conclusion of no coincidence with the CO-stripe architecture.

      2. In Figure 3, cortical depth-dependent analyses are presented for color, disparity, and texture processing. I acknowledge that the authors took care of venous effects by excluding outlier voxels. However, the GE-BOLD signal at high magnetic fields is still biased to extravascular contributions from around larger veins. Therefore, the highest color selectivity in superficial layers might also result from the bias to draining veins and might not be of neuronal origin. Furthermore, it is interesting that cortical profiles with the highest selectivity in superficial layers show overall higher selectivity across cortical depth. Could the missing increase toward the pial surface in other profiles result from the ROI definition or overall smaller signal changes (effect size) of selected voxels? At least, a more careful interpretation and discussion would be helpful for the reader.

      3. I was slightly surprised that no retinotopy data was acquired. The ROI definition in the manuscript was based on a retinotopy atlas plus manual stripe segmentation of single columns. Both steps have disadvantages because they neglect individual differences and are based on subjective assessment. A few points might be worth discussing: (1) In lines 467-468, the authors state that V2 was defined based on the extent of stripes. This classical definition of area V2 was questioned by a recent publication (Nasr et al., 2016, J Neurosci, 36, 1841-1857), which showed that stripes might extend into V3. Could this have been a problem in the present analysis, e.g., in the connectivity analysis? (2) The manual segmentation depends on the chosen threshold value, which is inevitably arbitrary. Which value was used?

      4. The use of 1-mm isotropic voxels is relatively coarse for cortical depth-dependent analyses, especially in the early visual cortex, which is highly convoluted and has a small cortical thickness. For example, most layer-fMRI studies use a voxel size of around 0.8 mm isotropic, which has about half the voxel volume of 1 mm isotropic voxels. With increasing voxel volume, partial volume effects become more pronounced. For example, partial volume with CSF might confound the analysis by introducing pulsatility effects.

      5. The SVM analysis included a feature selection step stated in lines 531-533. Although this step is reasonable for the training of a machine learning classifier, it would be interesting to know if the authors think this step could have reintroduced some bias to remaining draining vein contributions.
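
      The voxel-volume comparison in point 4 is simple arithmetic and can be checked directly. The helper name `voxel_volume_mm3` is an assumption made for this illustration.

```python
def voxel_volume_mm3(edge_mm):
    """Volume of an isotropic voxel with the given edge length in mm."""
    return edge_mm ** 3


# Ratio of a 0.8 mm isotropic voxel to a 1 mm isotropic voxel.
volume_ratio = voxel_volume_mm3(0.8) / voxel_volume_mm3(1.0)
print(f"0.8 mm voxel volume relative to 1 mm: {volume_ratio:.3f}")
```

      The ratio is 0.8 cubed, or 0.512, so a 0.8 mm isotropic voxel has roughly half the volume of a 1 mm isotropic voxel.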

    3. Reviewer #3 (Public Review):

      Summary:<br /> Ai et al. studied texture, color, and disparity selectivity in the human visual cortex at the mesoscale level using high-resolution fMRI. They reproduced earlier monkey and human studies showing interdigitated color-selective and disparity-selective sub-compartments within area V2, likely corresponding to thin and thick stripes, respectively. At least with the stimuli used, no clear evidence for texture-selective mesoscale activations was observed in area V2. The most interesting and novel part of this study focused on cortical-depth-dependent connectivity analyses across areas. The data suggest feedback and feedforward functional connectivity between V1 and V3A for disparity signals and feedback from V4 to the deep layers of V2 for textures.

      Strengths:<br /> High-resolution fMRI and highly interesting layer-specific informational connectivity analyses.

      Weaknesses:<br /> The authors tend to overclaim their results.

    1. eLife assessment

      This paper provides valuable insights into the neural substrates of human working memory. Through clever experimental design and rigorous analyses, the paper provides compelling evidence that the working memory representation of stimulus orientation is a reformatted version of the presented stimulus, reflecting the content that is of importance to the task. This work will be of broad interest to cognitive neuroscientists working on the neural bases of visual perception and memory.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The authors aim to test the sensory recruitment theory of visual memory, which assumes that visual sensory areas are recruited for working memory, and that these sensory areas represent visual memories in a similar fashion to how perceptual inputs are represented. To test the overlap between working memory (WM) and perception, the authors use coarse stimulus (aperture) biases that are known to account for (some) orientation decoding in the visual cortex (i.e., stimulus energy is higher for parts of an image where a grating orientation is perpendicular to an aperture edge, and stimulus energy drives decoding). Specifically, the authors show gratings (with a given "carrier" orientation) behind two different apertures: one is a radial modulator (with maximal energy aligned with the carrier orientation) and the other an angular modulator (with maximal energy orthogonal to the carrier orientation). When the subject detects contrast changes in these stimuli (the perceptual task), orientation decoding only works when training and testing within each modulator, but not across modulators, showing the impact of stimulus energy on decoding performance. Instead, when subjects remember the orientation over a 12s delay, orientation decoding works irrespective of the modulator used. The authors conclude that representations during WM are therefore not "sensory-like", given that they are immune to aperture biases. This invalidates the sensory recruitment hypothesis, or at least the part assuming that when sensory areas are recruited during WM, they are recruited in a manner that resembles how these areas are used during perception.

      Strengths:<br /> Duan and Curtis very convincingly show that aperture effects that are present during perception do not appear to be present during the working memory delay. Especially when the debate about "why can we decode orientations from human visual cortex" was in full swing, many may have quietly assumed this to be true (e.g., "the memory delay has no stimuli, and ergo no stimulus aperture effects"), but it is definitely not self-evident and nobody ever thought to test it directly until now. In addition to the clear absence of aperture effects during the delay, Duan and Curtis also show that when stimulus energy aligns with the carrier orientation, cross-generalization between perception and memory does work (which could explain why perception-to-memory cross-decoding also works). All in all, this is a clever manipulation, and I'm glad someone did it, and did it well.

      Weaknesses:<br /> There seems to be a major possible confound that prohibits strong conclusions about "abstractions" into "line-like" representation, which is spatial attention. What if subjects simply attend to the endpoints of the carrier grating, or attend to the edge of the screen where the carrier orientation "intersects" in order to do the task? This may also result in reconstructions that have higher BOLD in areas close to the stimulus/screen edges along the carrier orientation. The question then would be if this is truly an "abstracted representation", or if subjects are merely using spatial attention to do the task.

      Alternatively (and this reaches back to the "fine vs coarse" debate), another argument could be that during memory, what we are decoding is indeed fine-scale inhomogeneous sampling of orientation preferences across many voxels. This is clearly not the most convincing argument, as the spatial reconstructions (e.g., Figure 3A and C) show higher BOLD for voxels with receptive fields that are aligned to the remembered orientation (which is in itself a form of coarse-scale bias), but could still play a role.

      To conclude that the spatial reconstruction from the data indeed comes from a line-like representation, you'd need to generate modeled reconstructions of all possible stimuli and representations. Yes, Figure 4 shows that a line results in a modeled spatial map that resembles the WM data, but many other stimuli might too, and some may better match the data. For example, the alternative hypothesis (attention to grating endpoints) may very well lead to a model output very comparable to the one from a line. However, testing this would not suffice, as there may be an inherent inverse problem (with multiple stimuli that can lead to the same visual field model).

      The main conclusion, and title of the paper, that visual working memories are abstractions of percepts, is therefore not supported. Subjects could be using spatial attention, for example. Furthermore, even if it is true that gratings are abstracted into lines, this form of abstraction would not generalize to any non-spatial feature (e.g., color cannot become a line, contrast cannot become a line, etc.), which means it has limited explanatory power.

      Additional context:<br /> The working memory and perception tasks are rather different. In this case, the perception task does not require the subject to process the carrier orientation (which is largely occluded, and possibly not that obvious without paying attention to it), but attention is paid to contrast. In this scenario, stimulus energy may dominate the signal. In the WM task, subjects have to work out what orientation is shown to do the task. Given that the sensory stimulus in both tasks is brief (1.5s during memory encoding, and 2.5s total in the perceptual task), it would be interesting to look at decoding (and reconstructions) for the WM stimulus epoch. If abstraction (into a line) happens in working memory, then this perceptual part of the task should still be susceptible to aperture biases. This would allow the authors to show that it is indeed during memory (and not merely the task or attentional state of the subject) that abstraction occurs.

      What's also interesting is what happens in the passive perceptual condition, and the fact that spatial reconstructions for areas beyond V1 and V2 (i.e., V3, V3AB, and IPS0-1) align with (implied) grating endpoints, even when an angular modulator is used (Figure 3C). Are these areas also "abstracting" the stimulus (in a line-like format)?

    3. Reviewer #2 (Public Review):

      Summary:<br /> According to the sensory recruitment model, the contents of working memory (WM) are maintained by activity in the same sensory cortical regions responsible for processing perceptual inputs. A strong version of the sensory recruitment model predicts that stimulus-specific activity patterns measured in sensory brain areas during WM storage should be identical to those measured during perceptual processing. Previous research casts doubt on this hypothesis, but little is known about how stimulus-specific activity patterns during perception and memory differ. Through clever experimental design and rigorous analyses, Duan & Curtis convincingly demonstrate that stimulus-specific representations of remembered items are highly abstracted versions of representations measured during perceptual processing and that these abstracted representations are immune to aperture biases that contribute to fMRI feature decoding. The paper provides converging evidence that neural states responsible for representing information during perception and WM are fundamentally different, and provides a potential explanation for this difference.

      Strengths:<br /> 1. The generation of stimuli with matching vs. orthogonal orientations and aperture biases is clever and sets up a straightforward test regarding whether and how aperture biases contribute to orientation decoding during perception and WM. The demonstration that orientation decoding during perception is driven primarily by aperture bias while during WM it is driven primarily by orientation is compelling.

      2. The paper suggests a reason why orientation decoding during WM might be immune to aperture biases: by weighting multivoxel patterns measured during WM storage by spatial population receptive field estimates from a different task, the authors show that remembered - but not actively viewed - orientations form "line-like" patterns in retinotopic cortical space.

      Weaknesses:<br /> 1. The paper tests a strong version of the sensory recruitment model, where neural states representing information during WM are presumed to be identical to neural states representing the same information during perceptual processing. As the paper acknowledges, there is already ample reason to doubt this prediction (see, e.g., earlier work by Kok & de Lange, Curr Biol 2014; Bloem et al., Psych Sci, 2018; Rademaker et al., Nat Neurosci, 2019; among others). Still, the demonstration that orientation decoding during WM is immune to aperture biases known to drive orientation decoding during perception makes for a compelling demonstration.

      2. Earlier work by the same group has reported line-like representations of orientations during memory storage but not during perception (e.g., Kwak & Curtis, Neuron, 2022). It's nice to see that result replicated during explicit perceptual and WM tasks in the current study, but I question whether the findings provide fundamental new insights into the neural bases of WM. That would require a model or explanation describing how stimulus-specific activation patterns measured during perception are transformed into the "line-like" patterns seen during WM, which the authors acknowledge is an important goal for future research.

    4. Reviewer #3 (Public Review):

      Summary:<br /> In this work, Duan and Curtis addressed an important issue related to the nature of working memory representations. This work is motivated by findings illustrating that orientation decoding performance for perceptual representations can be biased by the stimulus aperture (modulator). Here, the authors examined whether the decoding performance for working memory representations is similarly influenced by these aperture biases. The results provide convincing evidence that working memory representations have a different representational structure, as the decoding performance was not influenced by the type of stimulus aperture.

      Strengths:<br /> The strength of this work lies in the direct comparison of decoding performance for perceptual representations with working memory representations. The authors take a well-motivated approach and illustrate that perceptual and working memory representations do not share a similar representational structure. The authors test a clear question with a rigorous approach and provide convincing evidence. First, the presented oriented stimuli are carefully manipulated to create orthogonal biases introduced by the stimulus aperture (radial or angular modulator), regardless of the stimulus carrier orientation. Second, the authors implement advanced methods to decode the orientation information present in visual and parietal cortical regions when subjects directly perceive an oriented stimulus or hold it in memory. The data illustrate that working memory decoding is not influenced by the type of aperture, while this is the case in perception. In sum, the main claims are important and shed light on the nature of working memory representations.

      Weaknesses:<br /> I have a few minor concerns that, although they don't affect the main conclusion of the paper, should still be addressed.

      1. Theoretical framing in the introduction: Recent work has shown that decoding of orientation during perception does reflect orientation selectivity, and it is not only driven by the stimulus aperture (Roth, Kay & Merriam, 2022).

      2. Figure 1C illustrates the principle of how the radial and angular modulators bias the contrast energy extracted by the V1 model, which in turn would influence orientation decoding. It would be informative if the carrier orientations used in the experiment were shown in this figure, or, at a minimum, if the legend mentioned that the experiment used 3 carrier orientations (15°, 75°, 135°) clockwise from vertical. Relatedly, when trying to find more information regarding the carrier orientation, the 'Stimuli' section of the Methods incorrectly mentions that 180 orientations are used as the carrier orientation.

      3. The description of the image computable V1 model in the Methods is incomplete, and at times inaccurate. i) The model implements 6 orientation channels, which is inaccurately referred to as a bandwidth of 60° (should be 180°/6 = 30°). ii) The steerable pyramid combines information across phase pairs to obtain a measure of contrast energy for a given stimulus. Here, it is only mentioned that the model contains different orientation and spatial scale channels. I assume there were also 2 phase pairs, and they were combined in some manner (squared and summed to create contrast energy). Currently, it is unclear what the model output represents. iii) The spatial scale channel with the maximal response differences between the 2 modulators was chosen as the final model output. What spatial frequency does this channel refer to, and how does this spatial frequency relate to the stimulus?

      4. It is not clear from the Methods how the difficulty in the perceptual control task was controlled. How were the levels of task difficulty created?
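
      The contrast-energy computation that point 3 asks the authors to spell out can be sketched as follows. This is a minimal illustration, not the authors' steerable pyramid: it swaps in a plain quadrature pair of Gabor filters (even and odd phase) whose outputs are squared and summed, with 6 orientation channels spaced 180°/6 = 30° apart. All function names, the filter size, the spatial frequency, and the envelope width are assumptions chosen for the example.

```python
import numpy as np

N_CHANNELS = 6
CHANNEL_SPACING_DEG = 180.0 / N_CHANNELS  # 30 degrees between channel centers
ORIENTATIONS_DEG = np.arange(N_CHANNELS) * CHANNEL_SPACING_DEG


def _grid(size):
    """Pixel coordinates centered on zero for an odd-sized square patch."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    return x, y


def gabor_pair(size, sf, theta_deg, sigma):
    """Even (cosine) and odd (sine) Gabor filters forming a quadrature pair.

    theta_deg is the direction of luminance modulation; sf is in cycles/pixel.
    """
    x, y = _grid(size)
    theta = np.deg2rad(theta_deg)
    u = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    even = envelope * np.cos(2 * np.pi * sf * u)
    odd = envelope * np.sin(2 * np.pi * sf * u)
    return even - even.mean(), odd  # remove the DC response of the even filter


def contrast_energy(image, sf, theta_deg, sigma):
    """Square and sum the two phase responses to get phase-tolerant energy."""
    even, odd = gabor_pair(image.shape[0], sf, theta_deg, sigma)
    return np.sum(image * even) ** 2 + np.sum(image * odd) ** 2


def grating(size, sf, theta_deg, phase=0.0):
    """Sinusoidal grating with the same orientation convention as gabor_pair."""
    x, y = _grid(size)
    theta = np.deg2rad(theta_deg)
    return np.cos(2 * np.pi * sf * (x * np.cos(theta) + y * np.sin(theta)) + phase)


# Hypothetical stimulus: a grating aligned with the 30-degree channel.
stimulus = grating(size=33, sf=0.1, theta_deg=30.0, phase=0.7)
energies = np.array(
    [contrast_energy(stimulus, 0.1, th, sigma=5.0) for th in ORIENTATIONS_DEG]
)
best_channel_deg = ORIENTATIONS_DEG[np.argmax(energies)]
```

      Because the two phase responses are squared and summed, the energy depends only weakly on the phase of the grating. This is why the model output is usually read as local contrast energy per orientation channel rather than as a linear filter response.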

    1. Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic, and chemogenetic approaches, which present their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostrocaudal axis where the expiratory oscillator is located. The authors injected a solution of bicuculline (a GABA-A receptor antagonist) and fluorobeads at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods, and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the heterogeneity in the functional response of this anatomical structure.

    2. Reviewer #2 (Public Review):

      Summary:

      Pisanski and colleagues map regions of the brainstem that produce the rhythm for active expiratory breathing movements and influence their motor patterns. While the neural origins of inspiration are very well understood, our understanding of the neural bases for expiration lags considerably. The problem is important and new knowledge pertaining to the neural origins of expiration is welcome.

      The authors perturb the parafacial lateral (pFL) respiratory group of the brainstem with microinjection of bicuculline, to elucidate how disinhibition in specific locations of the pFL influences active expiration (and breathing in general) in anesthetized rats. They provide valuable, if not definitive, evidence that the borders of the pFL appear to extend more rostrally than previously appreciated. Prior research suggests that the expiratory pFL exists at the caudal pole of the facial cranial nucleus (VIIc). Here, the authors show that its borders probably extend as much as 1 mm rostral to VIIc. The evidence is convincing albeit with caveats.

      Strengths:

      The authors achieve their aim in terms of showing that the borders of the expiratory pFL are not well understood at present and that it (the pFL) extends more rostrally. The results support that point. The data are strong enough to cause many respiratory neurobiologists to look at the sites rostral to the VIIc for expiratory rhythmogenic neurons and characterize their properties and mechanisms. At present my view is that most respiratory neurobiologists overlook the regions rostral to VIIc in their studies of expiratory rhythm and pattern.

      Weaknesses:

      The injection of bicuculline has indiscriminate effects on excitatory and inhibitory neurons, and the parafacial region is populated by excitatory neurons that are expiratory rhythmogenic and GABAergic and glycinergic neurons whose roles in producing active expiration are contradictory (Flor et al. J Physiol, 2020, DOI: 10.1113/JP280243). It remains unclear how the microinjections of bicuculline differentially affect all three populations. A more selective approach would be able to disinhibit the populations separately. Nevertheless, for the main point at hand, the data do suggest that we should reconsider the borders of the expiratory pFL nucleus and begin to examine its physiology up to 1 mm rostral to VIIc.

      The control experiment showed that bicuculline microinjections induced cFos expression in the pFL, which is good, but again we don't know which neurons were disinhibited: glutamatergic, GABAergic, or glycinergic.

      The manuscript characterizes how bicuculline microinjections affect breathing parameters such as tidal volume, frequency, ventilation, inspiratory and expiratory time, as well as oxygen consumption. Those aspects of the manuscript are a bit tedious and sometimes overanalyzed. Plus, there was no predictive framework established at the outset for how one should expect disinhibition to affect breathing parameters. In other words, if the authors are seeking to map the pFL borders, then why analyze the breathing patterns so much? Does doing so provide more insight into the borders of pFL? I did not think it was compellingly argued.

      Further, lines 382-386 make a point about decreasing inspiratory time even though the data do not meet the statistical threshold.

      In lines 386-395, the reporting appears to reach significance (line 388) but not reach significance (line 389). I had trouble making sense of that disparity.

      The other statistical hiccups include "tended towards significance" (line 454), "were found to only reach significance for a short portion of the response" (lines 486-487), and "did not reach the level of significance" (line 506), which give one the sense of cherry picking or over-analysis. Frankly, this reviewer finds the paper much more compelling when just asking whether the microinjections evoke active expiration. If yes, then the site is probably part of the pFL.

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size.
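      The point about reporting effect size alongside p can be illustrated with a minimal sketch. It computes Cohen's d (a standardized mean difference using the pooled standard deviation) for two groups. The function name `cohens_d` and the sample values are hypothetical illustrations, not data or analysis code from the manuscript under review.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent samples.

    Uses the pooled sample standard deviation, so the result is expressed
    in units of within-group variability rather than raw units.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical values for illustration only (e.g. a breathing parameter
# after injection versus baseline); these are not measurements from the study.
after = [2.0, 3.0, 4.0]
baseline = [1.0, 2.0, 3.0]
d = cohens_d(after, baseline)
```

      Unlike a p-value, d does not shrink or grow simply because more cycles or animals are sampled, which is why the two quantities are usually reported together.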

    3. Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al. investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostrocaudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA current, so insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insight into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it provides valuable insights into this region's complex role in controlling breathing.

    1. eLife assessment

      Based on analyses of retinae from genetically modified mice, and from wild-type ground squirrel and macaque employing microscopic imaging, electrophysiology, and pharmacological manipulations, this useful study on the role of Cav1.4 calcium channels in cone photoreceptor cells (i) shows that the expression of a Cav1.4 variant lacking calcium conductivity supports the development of cone synapses beyond what is observed in the complete absence of Cav1.4, and (ii) indicates that the cone pathway can partially operate even without calcium flux through Cav1.4 channels, thus preserving behavioral responses under bright light. The evidence for the function of Cav1.4 protein in synapse development is convincing, and in agreement with a closely related earlier study by the same authors on rod photoreceptors, but the evidential support for the notion of a homeostatic compensation of Cav1.4 loss by Cav3 is incomplete. As congenital Cav1.4 dysfunction can cause stationary night blindness, this work relates to a wide range of neuroscience topics, from synapse biology to neuro-ophthalmology.

    2. Reviewer #1 (Public Review)

      Cav1.4 calcium channels control voltage-dependent calcium influx at photoreceptor synapses, and congenital loss of Cav1.4 function causes stationary night blindness (CSNB2). Based on a broad portfolio of methodological approaches - genetic mouse models, immunolabeling and microscopic imaging, serial block-face-SEM, ERGs, and electrophysiology - the authors show that cone photoreceptor synapse development is strongly perturbed in the absence of Cav1.4 protein, and that expression of a nonconducting Cav1.4 channel mitigates these perturbations. Further data indicate that Cav3 channels are present, which, according to the authors, may compensate for the loss of Cav1.4 calcium currents and thus maintain cone synaptic transmission. These data, which are in agreement with a similar study by the same authors on rod photoreceptor synapses, help to explain what functional defects exactly cause CSNB2 and why it is accompanied by only mild visual impairment.

      The strengths of the present study are its conceptual and experimental soundness, the broad spectrum of cutting-edge methodological approaches pursued, and the convincing differential analysis of mutant phenotypes.

      Weaknesses mainly concern the experiments and arguments leading to the authors' notion that Cav3 channels may partially compensate for the loss of Cav1.4 calcium currents in cone synapses. It is possible that the non-conducting Cav1.4 variant supports synapse development and the Cav3 channel then provides the calcium influx. However, in its current state, the study does not unequivocally assess Cav3 expression in wild-type cones, it lacks direct evidence of Cav3 expression and upregulation, e.g. via single cell transcriptomics, immunolabeling, or an elaboration on electrophysiology, and it does not test the authors' earlier idea that Cav1.4 might couple to intracellular calcium stores at photoreceptor synapses.

    3. Reviewer #2 (Public Review)

      Summary:

      This paper by Maddox et al. presents the results of a study of Ca channel function in mouse cone photoreceptor synaptic terminals. It builds on earlier work by the same authors (Maddox et al. 2020 in eLife) which demonstrated that a non-conducting but voltage-sensing variant of Cav1.4 (G369i knock-in, or KI) could substitute for WT Cav1.4 to promote relatively normal rod synapse development despite an inability to support Ca2+-dependent glutamatergic transmission to postsynaptic bipolar cells. Cav1.4 knock-out (KO) rod synapses, however, were completely disorganized, indicating that the presence of Cav1.4 protein is critical for synaptic organization. Here, the authors extend their study of the G369i-KI retina to demonstrate that G369i-KI cones develop working (though disrupted and sometimes aberrant) synapses that support some visual function owing to compensatory expression of Cav3-containing Ca channels that can mediate some Ca2+-dependent transmission from cones to postsynaptic cells. This compensatory expression of a low voltage-activated Ca conductance was not noted previously (Maddox et al. 2020) in G369i-KI rods.

      Strengths:

      In all, this is a scientifically sound study that shows obvious differences between synaptic terminal morphology and organization, macroscopic Ca currents, transmission to postsynaptic horizontal and bipolar cells (with whole-cell recording and ERG, respectively), and visually-guided behavior in experimental groups.

      Weaknesses:

      The major criticism that I have of the study is that it infers Ca channel molecular composition based solely on pharmacological analysis, which, as the authors note, is confounded by the cross-reactivity of many of the "specific" channel-type antagonists. The authors note that Cav3 mRNAs have been found in cones, but here, they do not perform any analysis to examine Cav3 transcript expression after G369i-KI nor do they examine Ca channel transcript expression in monkey or squirrel cones, which serve as controls of sorts for the G369i-KI (i.e. like WT mouse cones, cones of these other species do not seem to exhibit LVA Ca currents).

      Secondarily, in Maddox et al. 2020, the authors raise the possibility that G369i-KI, by virtue of having a functional voltage-sensing domain, might couple to intracellular Ca2+ stores, and it seems appropriate that this possibility be considered experimentally here.

      As a minor point: the authors might wish to note, in comparison to another retinal ribbon synapse, that Zhang et al. 2022 (in J. Neuroscience) performed a study of mouse rod bipolar cells and found a number of LVA and HVA Ca conductances in addition to the typical L-type conductance mediated by Cav1-containing channels.

    4. Reviewer #3 (Public Review)

      Summary:

      This is an important study that tests the hypothesis that Cav1.4 calcium channels do more than provide a voltage-dependent influx of Ca2+ into photoreceptors. The relevant background can be divided into two tranches. First, deletion of Cav1.4 channels (Cav1.4 knock-out) disrupts rod and cone photoreceptors and their synapses in the outer plexiform layer. Second, knock-in of a non-conducting Cav1.4 channel (Cav1.4 knock-in) partially spares the organization of the outer plexiform layer and photoreceptor synapses (Maddox et al., eLife 2020), which is remarkable considering the disruption of the outer plexiform layer in the Cav1.4 knock-out. In addition, phototransduction, assessed by scotopic and photopic electroretinography (a-wave amplitude) in the Cav1.4 knock-in retina was partially spared for rods and only slightly impaired for cones. However, the non-conducting Cav1.4 channel of the Cav1.4 knock-in failed to rescue synaptic transmission across the outer retina (electroretinography: b-wave amplitude, Maddox et al., eLife 2020). The 2020 study by Maddox et al. (eLife) focused more on the rod pathway, while the current work addresses the cone pathway.

      Strengths:

      The study addresses the important question of how disruption of Cav1.4 function in both rod and cone photoreceptors leads to impairment primarily of the rod pathway for scotopic vision. This is clinically relevant as human mutations lead to stationary night blindness rather than blindness. The work provides excellent single-cell electrophysiological recordings of Ca2+ currents from cones of wild-type, Cav1.4 knock-out, and Cav1.4 knock-in mice and, in addition, from ground squirrel and monkey cones. To make these recordings successfully in the various species and the compromised retinas (Cav1.4 knock-out and Cav1.4 knock-in) is very impressive. The findings clearly advance our understanding of Ca2+ channel function in cones. In addition, the study presents high-quality electron microscopy reconstructions of cones and further physiological and behavioral data related to the cone pathway.

      Weaknesses:

      The major critiques are related to the description of the Cav1.4 knock-in mouse as "sparing" function, which can be remedied in part by a simple rewrite, and in certain places, the data may need to be examined more critically. In particular, the authors should address features in the data presented in Figures 6 and 7 that seem to indicate that the retina of the Cav1.4 knock-in is not intact; the authors' interpretation of it as "intact" is not appropriate and is made without rigorous statistical testing.

    5. Reviewer #4 (Public Review)

      Summary:

      Cav1.4 voltage-gated calcium channels play an important role in neurotransmission at mammalian photoreceptor synapses. Mutations in the CACNA1F gene lead to congenital stationary night blindness that particularly affects the rod pathway. Mouse Cav1.4 knockout and Cav1.4 knockin models suggest that Cav1.4 is also important for the cone pathway. Deletion of Cav1.4 in the knockout models leads to signaling malfunctions and to abundant morphological re-arrangements of the synapse, suggesting that the channel not only has a role in the influx of Ca2+ but also in the morphological organization of the photoreceptor synapse. Of note, additional Cav channels have also been previously detected in cone synapses by different groups, including L-type Cav1.3 (Wu et al., 2007; pmid; Kersten et al., 2020; pmid), and also T-type Cav3.2 (Davison et al., 2021; pmid 35803735).

      In order to study a conductivity-independent role of Cav1.4 in the morphological organization of photoreceptor synapses, the authors generated the knockin (KI) mouse Cav1.4 G369i in a previous study (Maddox et al., eLife 2020; pmid 32940604). The Cav1.4 G369i KI channel no longer works as a Ca2+-conducting channel due to the insertion of a glycine in the pore-forming unit (Maddox et al., eLife 2020; pmid 32940604). In this previous study (Maddox et al., eLife 2020; pmid 32940604), the authors analyzed Cav1.4 G369i in rod photoreceptor synapses. In the present study, the authors analyzed cone synapses in this KI mouse.

      For this purpose, the authors applied a comprehensive set of experimental methods including immunohistochemistry with antibodies (also with quantitative analyses), electrophysiological measurements of presynaptic Ca2+ currents from cone photoreceptors in the presence/absence of inhibitors of L-type and T-type calcium channels, electron microscopy (FIB-SEM), ERG recordings and visual behavior tests of the Cav G369i KI in comparison to the Cav1.4 knockout and wild-type control mice.

      The authors found that the non-conducting Cav channel is properly localized in cone synapses and demonstrated that there are no gross morphological alterations (e.g., sprouting of postsynaptic components that are typically observed in the Cav1.4 knockout). These findings demonstrate that cone synaptogenesis relies on the presence of Cav1.4 protein but not on its Ca2+ conductivity. This result, obtained at cone synapses in the present study, is similar to the previously reported results observed for rod synapses (Maddox et al., eLife 2020, pmid 32940604). No further mechanistic insights or molecular mechanisms were provided that demonstrated how the presence of the Cav channels could orchestrate the building of the cone synapse.

      Strengths:

      The study has been expertly performed. It uses a comprehensive set of experimental methods including immunohistochemistry with antibodies (also with quantitative analyses), electrophysiological measurements of presynaptic Ca2+ currents from cone photoreceptors in the presence/absence of inhibitors of L-type and T-type calcium channels, electron microscopy (FIB-SEM), ERG recordings and visual behavior tests of the Cav G369i KI in comparison to the Cav1.4 knockout and wild-type control mice.

      Weaknesses:

      The study has been expertly performed but remains descriptive without deciphering the underlying molecular mechanisms of the observed phenomena, including the proposed homeostatic switch of synaptic calcium channels. Furthermore, a relevant part of the data in the present paper (presence of T-type calcium channels in cone photoreceptors) has already been identified/presented by previous studies of different groups (Macosko et al., 2015; pmid 26000488; Davison et al., 2021; pmid 35803735; Williams et al., 2022; pmid 35650675). The degree of novelty of the present paper thus appears limited.

    1. eLife assessment

      This paper provides a valuable method that uses a computational model to predict photoreceptor currents in mammalian photoreceptors. By inverting the model, visual stimuli can be constructed to produce desired photoreceptor current responses. The authors provide convincing evidence that this approach can disentangle the effects of photoreceptor nonlinearities including light adaptation from downstream nonlinear processing, thus facilitating future studies of the higher visual system.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript aims at a quantitative model of how visual stimuli, given as time-dependent light intensity signals, are transduced into electrical currents in photoreceptors of macaque and mouse retina. Based on prior knowledge of the fundamental biophysical steps of the transduction cascade and a relatively small number of free parameters, the resulting model is found to fairly accurately capture measured photoreceptor currents under a range of diverse visual stimuli and with parameters that are (mostly) identical for photoreceptors of the same type.

      Furthermore, as the model is invertible, the authors show that it can be used to derive visual stimuli that result in a desired, predetermined photoreceptor response. As demonstrated with several examples, this can be used to probe how the dynamics of phototransduction affect downstream signals in retinal ganglion cells, for example, by manipulating the visual stimuli in such a way that photoreceptor signals are linear or have reduced or altered adaptation. This innovative approach had already previously been used by the same lab to probe the contribution of photoreceptor adaptation to differences between On and Off parasol cells (Yu et al, eLife 2022), but the present paper extends this by describing and testing the photoreceptor model more generally and in both macaque and mouse as well as for both rods and cones.
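      The idea of inverting a causal nonlinear model to obtain a stimulus for a desired response can be sketched with a toy example. The model below is NOT the authors' phototransduction model: it is a hypothetical one-state system, assumed here purely for illustration, in which the response is a leaky integrator whose input gain falls as the response grows. The names `forward` and `invert`, the gain rule, and the parameter values are all assumptions.

```python
import math


def forward(stimulus, r0=0.0, tau=5.0, dt=1.0):
    """Toy adapting model: leaky integration with response-dependent gain."""
    r, responses = r0, []
    for s in stimulus:
        gain = 1.0 / (1.0 + r)            # gain decreases as the response grows
        r = r + dt * (gain * s - r) / tau  # one explicit Euler step
        responses.append(r)
    return responses


def invert(target, r0=0.0, tau=5.0, dt=1.0):
    """Solve each update step for the stimulus that yields the target response."""
    r, stimulus = r0, []
    for r_next in target:
        gain = 1.0 / (1.0 + r)
        # Rearranged Euler step: the state is known, so the input is unique.
        s = ((r_next - r) * tau / dt + r) / gain
        stimulus.append(s)
        r = r_next
    return stimulus


# Desired response: a clean sinusoid around a positive mean (arbitrary units).
target = [0.5 + 0.2 * math.sin(2.0 * math.pi * k / 50.0) for k in range(200)]
stimulus = invert(target)
recovered = forward(stimulus)
max_error = max(abs(a - b) for a, b in zip(recovered, target))
```

      In this toy setting the inversion is exact because each step has a single unknown input. Nothing prevents the computed stimulus from becoming negative when the target changes quickly, which is one simple way to see why an inversion of this kind can have practical limits.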

      Strengths:

      The presentation of the model is thorough and convincing, and the ability to capture responses to stimuli as different as white noise with varying mean intensity and flashes with a common set of model parameters across cells is impressive. Also, the suggested approach of applying the model to modify visual stimuli that effectively alter photoreceptor signal processing is thought-provoking and should be a powerful tool for future investigations of retinal circuit function. The examples of how this approach can be applied are convincing and corroborate, for example, previous findings that adaptation to ambient light in the primate retina, as measured by responses to light flashes, mostly originates in photoreceptors.

      Weaknesses:

      In the current form of the presentation, it doesn't become fully clear how easily the approach is applicable at different mean light levels and where exactly the limits for the model inversion are at high frequency. Also, accessibility and applicability by others could be strengthened by including more details about how parameters are fixed and what consensus values are selected.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript proposes a modeling approach to capture nonlinear processes of photocurrents in mammalian (mouse, primate) rod and cone photoreceptors. The ultimate goal is to separate these nonlinearities at the level of photocurrent from subsequent nonlinear processing that occurs in retinal circuitry. The authors devised a strategy to generate stimuli that cancel the major nonlinearities in photocurrents. For example, modified stimuli would generate genuine sinusoidal modulation of the photocurrent, whereas a sinusoidal stimulus would not (i.e., because of asymmetries in the photocurrent to light vs. dark changes); other modified stimuli could cancel the effects of light adaptation at the photocurrent level. Using these modified stimuli, one could record downstream neurons, knowing that any nonlinearities that emerge must happen post-photocurrent. This could be a useful method for separating nonlinear mechanisms across different stages of retinal processing, although there are some apparent limitations to the overall strategy.

      Strengths:

      1. This is a very quantitative and thoughtful approach and addresses a long-standing problem in the field: determining the location of nonlinearities within a complex circuit, including asymmetric responses to different polarities of contrast, adaptation, etc.
      2. The study presents data for two primary models of mammalian retina, mouse and primate, and shows that the basic strategy works in each case.
      3. Ideally, the present results would generalize to the work in other labs and possibly other sensory systems. How easy would this be? Would one lab have to be able to record both receptor and post-receptor neurons? Would in vitro recordings be useful for interpreting in vivo studies? It would be useful to comment on how well the current strategy could be generalized.

      Weaknesses:

      1. The model is limited to describing photoreceptor responses at the level of photocurrents, as opposed to the output of the cell, which takes into account voltage-dependent mechanisms, horizontal cell feedback, etc., as the authors acknowledge. How would one distinguish nonlinearities that emerge at the level of post-photocurrent processing within the photoreceptor as opposed to downstream mechanisms? It would seem as if one is back to the earlier approach, recording at multiple levels of the circuit (e.g., Dunn et al., 2006, 2007).
      2. It would have been nice to see additional confirmations of the approach beyond what is presented in Figure 9. This is limited by the sample (n = 1 horizontal cell) and the number of conditions (1). It would have been interesting to at least see the same test at a dimmer light level, where the major adaptation mechanisms are supposed to occur beyond the photoreceptors (Dunn et al., 2007).

    4. Reviewer #3 (Public Review):

      Summary:

      The authors propose to invert a mechanistic model of phototransduction in mouse and primate photoreceptors to derive stimuli that compensate for nonlinearities in these cells. They fit the model to a large set of photoreceptor recordings and show in additional data that the compensation works. This can allow the exclusion of photoreceptors as a source of nonlinear computation in the retina, as desired to pinpoint nonlinearities in retinal computation. Overall, the recordings made by the authors are impressive and I appreciate the simplicity and elegance of the idea. The data support the authors' conclusions but the presentation can be improved.

      Strengths:

      - The authors collected an impressive set of recordings from mouse and primate photoreceptors, which is very challenging to obtain.
      - The authors propose to exploit mechanistic mathematical models of well-understood phototransduction to design light stimuli that compensate for nonlinearities.
      - The authors demonstrate through additional experiments that their proposed approach works.

      Weaknesses:

      - The authors use numerical optimization for fitting the parameters of the photoreceptor model to the data. Recently, the field of simulation-based inference has developed methods to do so, including quantification of the uncertainty of the resulting estimates. Since the authors state that two different procedures were used due to the different amounts of data collected from different cells, it may be worthwhile to rather test these methods, as implemented e.g. in the SBI toolbox (https://joss.theoj.org/papers/10.21105/joss.02505). This would also allow them to directly identify dependencies between parameters, and obtain associated uncertainty estimates. This would also make the discussion of how well constrained the parameters are by the data or how much they vary more principled because the SBI uncertainty estimates could be used.

      - In several places, the authors refer the reader to look up specific values e.g. of parameters in the associated MATLAB code. I don't think this is appropriate, important values/findings/facts should be in the paper (lines 142, 114, 168). I would even find the precise values that the authors measure interesting, so I think the authors should show them in a figure/table. In general, I would like to see also the average variance explained by different models summarized in a table and precise mean/median values for all important quantities (like the response amplitude ratios in Figures 6/9).

      - If the proposed model is supposed to model photoreceptor adaptation on a longer time scale, I fail to see why this can be an invertible model. Could the authors explain this better? I suspect that the model is mainly about nonlinearities as the authors also discuss in lines 360ff.

      - The important Figures 6-8 are very hard to read, as it is not easy to see what the stimulus is, the modified stimulus, the response with and without modification, what the desired output looks like, and what is measured for part B. Reworking these figures would be highly recommended.

      - If I understand Figure 6 correctly, part B quantifies the size of the response to the small second flash relative to the response to the small first flash. While the response amplitude for the second flash is clearly only 50% of that for the first flash in primate rods and cones in the original condition, the modified stimulus seems to overcompensate and results in a 130% response for the second flash. How do the authors explain this? A similar effect occurs in Figure 9, which the authors should also discuss.

    1. eLife assessment

      In this important work, the authors show that brain activity thought to be a travelling wave may just be a series of sequentially activated sources at the neuron spiking level. They support this with convincing results from a turtle cortex preparation and relevant simulations. This work will be of interest to neuroscientists interested in understanding how cortical computations are made.

    2. Joint Public Review:

      Summary:

      In this interesting work, the authors investigated an important topical question: when we see travelling waves in cortical activity, is this due to true wave-like spread, or due to sequentially activated sources? In simulations, it is shown that sequential brain module activation can show up as a travelling wave - even in improved methods such as phase delay maps - and a variety of parameters is investigated. Then, in ex-vivo turtle eye-brain preparations, the authors show that visual cortex waves observable in local field potentials are in fact often better explained as areas D1 and D2 being sequentially activated. This has implications for how we think about travelling wave methodology and relevant analytical tools.
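      The core concern, that two sequentially activated sources can look like a travelling wave, can be illustrated with a minimal simulation. This is a sketch under stated assumptions and not the authors' simulation: a hypothetical one-dimensional row of sensors, two fixed sources at its ends whose contributions fall off with a Gaussian spatial profile, and the same Gaussian-shaped pulse emitted by each source with a fixed delay. The function names and all parameter values are illustrative.

```python
import math


def pulse(t, onset, width):
    """Gaussian-shaped activity pulse centred on `onset` (times in ms)."""
    return math.exp(-0.5 * ((t - onset) / width) ** 2)


def peak_latencies(n_sensors=11, delay=20.0, width=30.0, spread=3.0):
    """Peak latency at each sensor for two fixed, sequentially active sources.

    Source A sits at the first sensor and fires at t = 0; source B sits at
    the last sensor and fires `delay` ms later. Nothing propagates: each
    sensor simply records a spatially weighted mixture of the two pulses.
    """
    times = [float(t) for t in range(-100, 201)]
    pos_a, pos_b = 0.0, float(n_sensors - 1)
    latencies = []
    for x in range(n_sensors):
        w_a = math.exp(-0.5 * ((x - pos_a) / spread) ** 2)
        w_b = math.exp(-0.5 * ((x - pos_b) / spread) ** 2)
        signal = [w_a * pulse(t, 0.0, width) + w_b * pulse(t, delay, width)
                  for t in times]
        latencies.append(times[signal.index(max(signal))])
    return latencies


latency_map = peak_latencies()
```

      In this toy case the peak latency rises gradually from one end of the sensor row to the other, even though no activity travels between the two sources. A latency gradient of this kind is therefore not, on its own, evidence of wave-like spread.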

      Strengths:

      I enjoyed reading the discussion. The authors are careful in their claims, and point out that some phenomena may still indeed be genuine travelling waves, but we should have a higher evidence bar to claim this for a particular process in light of this paper and Zhigalov & Jensen (2023) (ref 44). Given this careful discussion, the claims made are well-supported by the experimental results. The discussion also gives a nice overview of potential options in light of this and future directions.

      The illustration of different Gaussian covariances leading to very different latency maps was interesting to see.

      Furthermore, the methods are detailed and clearly structured and the Supplementary Figures, particularly single trial results, are useful and convincing.

  2. Jan 2024
    1. eLife assessment

      This fundamental study advances our understanding of nitrogen metabolism by identifying a new type of guanidine-forming enzyme in eukaryotes. The key claims of the article are convincingly supported by the data, with meticulous biochemical, cellular, and in vivo studies on guanidine production. The work will stimulate interest in the cellular roles of homoarginine, and, more generally, in the biochemistry and metabolism of guanidine derivatives.

    2. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Nitrogen metabolism is of fundamental importance to biology. However, the metabolism and biochemistry of guanidine and guanidine-containing compounds, including arginine and homoarginine, have been understudied over the last few decades. Very few guanidine-forming enzymes have been identified. Funck et al define a new type of guanidine-forming enzyme. It was previously known that 2-oxoglutarate oxygenase catalysis in bacteria can produce guanidine via oxidation of arginine. Interestingly, the same enzyme that produces guanidine from arginine also oxidises 2-oxoglutarate to give the plant signalling molecule ethylene. Funck et al show that a mechanistically related oxygenase enzyme from plants can also produce guanidine, but instead of using arginine as a substrate, it uses homoarginine. The work will stimulate interest in the cellular roles of homoarginine, a metabolite present in plants and other organisms including humans and, more generally, in the biochemistry and metabolism of guanidines.

      1) Significance

      Studies on the metabolism and biochemistry of the small nitrogen-rich molecule guanidine and related compounds including arginine have been largely ignored over the last few decades. Very few guanidine-forming enzymes have been identified. Funck et al define a new guanidine-forming enzyme that works by oxidation of homoarginine, a metabolite present in organisms ranging from plants to humans. The new enzyme requires oxygen and 2-oxoglutarate as cosubstrates and is related to, but distinct from, a known enzyme that oxidises arginine to produce guanidine, but which can also oxidise 2-oxoglutarate to produce the plant signalling molecule ethylene.

      Overall, I thought this was an exceptionally well written and interesting manuscript. Although a 2-oxoglutarate-dependent guanidine-forming enzyme is known (EFE), the discovery that a related enzyme oxidises homoarginine is really interesting, especially given the presence of homoarginine in plant seeds. There is more work to be done in terms of functional assignment, but this can be the subject of future studies. I also fully endorse the authors' view that guanidine and related compounds have been massively understudied in recent times. I would like to see the possibility that the new enzyme makes ethylene explored. Congratulations to the authors on a very nice study.

      Response: We thank the reviewer for the positive evaluation of our manuscript. In the revised version, we have emphasized more clearly that we found no evidence for ethylene production by the recombinant enzymes. The other suggestions of the reviewer are also considered in the revised version as detailed below.

      Reviewer #2 (Public Review):

      In this study, Dietmar Funck and colleagues have made a significant breakthrough by identifying three isoforms of plant 2-oxoglutarate-dependent dioxygenases (2-ODD-C23) as homo/arginine-6-hydroxylases, catalyzing the degradation of 6-hydroxyhomoarginine into 2-aminoadipate-6-semialdehyde (AASA) and guanidine. This discovery marks the very first confirmation of plant or eukaryotic enzymes capable of guanidine production.

      The authors selected three plant 2-ODD-C23 enzymes with the highest sequence similarity to bacterial guanidine-producing (EFE) enzymes. They proceeded to clone and express the recombinant enzymes in E. coli, demonstrating the capacity of all three Arabidopsis isoforms to produce guanidine. Additionally, by precise biochemical experiments, the authors established these three 2-ODD-C23 enzymes as homoarginine-6-hydroxylases (and arginine-hydroxylase for one of them). Furthermore, the authors utilized transgenic plants expressing GFP fusion proteins to show the cytoplasmic localization of all three 2-ODD-C23 enzymes. Most notably, using T-DNA mutant lines and CRISPR/Cas9-generated lines, along with combinations of them, they demonstrate the guanidine-producing capacity of each enzyme isoform in planta. These results provide robust evidence that these three 2-ODD-C23 Arabidopsis isoforms are indeed homoarginine-6-hydroxylases responsible for guanidine generation.

      The findings presented in this manuscript are a significant contribution to our understanding of plant biology, particularly given that this work is the first demonstration of enzymatic guanidine production in eukaryotic cells. However, there are a couple of concerns and potential avenues for further investigation that the authors should consider incorporating.

      Firstly, the observation of cytoplasmic and nuclear GFP signals in the transgenic plants may also indicate cleaved GFP from the fusion proteins. Thus, the authors should perform Western blot analysis to confirm the correct size of the 2-ODD-C23 fusion proteins in the transgenic protoplasts.

      Secondly, it may be worth measuring pipecolate (and proline?) levels under biotic stress conditions (particularly those that induce transcript changes of these enzymes, Fig S8). Given the results suggesting a potential regulation of the pathway by biotic stress conditions (e.g. MeJA), these experiments could provide valuable insights into the physiological role of guanidine-producing enzymes in plants. This additional analysis may reveal the significance of these enzymes in plant defense mechanisms.

      Response: We also thank reviewer 2 for the positive evaluation and useful suggestions. We performed the proposed GFP Western blot, which indeed indicated the presence of both full-length fusion proteins and free GFP, which can explain the partial nuclear localization. We fully agree that further experiments with biotic and abiotic stress will be required to determine the physiological function of the 2-ODD-C23 enzymes. However, the list of potential experiments is long and they are beyond the scope of the present manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Specific points

      Overall, I thought this was a very interesting study, comprising biochemical, cellular, and in vivo studies. Of course more could be done on each of these, and likely will be, but I think the assignment of biochemical function is very strong, across all three approaches. The one new experiment I would like to see is a clear demonstration of whether ethylene is produced - unlikely but should be tested.

      We had mentioned our failure to detect ethylene production by the plant enzymes in the previous version and have made it more prominent and reliable by including ethylene production as a positive control in the new supplementary figure S5.

      Abstract

      Delete 'hitherto overlooked' - this is implicit. Change 'but is more likely' to 'is likely'?

      Agreed and modified

      Introduction

      Second sentence - what about relevant small molecule primary metabolites including precursors of proteins/nucleic acids.

      We modified the sentence accordingly.

      Paragraph 2 - maybe also note EFE produces glutamate semialdehyde, via arginine C-5 oxidation.

      Paragraph 2 has been re-phrased according to your suggestion.

      Overall, I thought the introduction was exceptionally well written.

      Perhaps either in the introduction, or later, note there are other 2OG oxygenases that oxidise arginine/arginine derivatives in various ways, e.g. clavaminate synthase/arginine hydroxylases/desaturases.

      We added a sentence mentioning the arginine hydroxylases VioC and OrfP to the introduction and included VioC into the sequence comparison in supplementary figure 2 to show that these enzymes, as well as NapI, are very different from EFE and the plant hydroxylases.

      Results

      Paragraph 1 - qualify similarity and refer to/give a structurally informed sequence alignment, including EFE

      A new supplemental figure S2 was added with sequence identity values and a structurally informed alignment. The text has been modified accordingly.

      Paragraph 2 - briefly state method of guanidine analysis

      We included a reference to the M&M section and mentioned LC-MS in paragraph 2.

      Figure 1 - trivial point - proteins are not expressed/genes are

      We have modified the legend to figure 1. However, we would like to point out that terms like “recombinant protein expression” are widely used in the field. A quick search with Google Ngram Viewer shows that “protein expression” started to appear in the mid-1980s and that its use has remained constant at about one-eighth of that of “gene expression”.

      Define errors clearly in all figure legends, clearly defining biological/technical repeats

      Page 6 - was the His-tag cleared to ensure no issues with Ni contamination?

      We treat individual plants or independent bacterial cultures as biological replicates. Only in the case of enzyme activity assays with NAD(P)H, technical replicates were used and this has been indicated in the legend of figure 6.

      Lower case 'p' in pentafluorobenzyl corrected

      In Figure 2 make clear the hydroxylated intermediates are not observed

      We now use grey color for the intermediates and have put them in brackets. Additionally, we state in the figure legend that these intermediates were not detected.

      Pages 6-7 - I may have missed this but it's important to investigate what happens to the 2OG. Is succinate the only product or is ethylene also produced? This possibility should also be considered in the plant studies, i.e. is there any evidence for responses related to perturbed ethylene metabolism. The authors consider a signalling role relating to AASA/P6C, but seem to ignore a potential ethylene connection.

      As stated above, we checked for ethylene production with a negative result. EFE produced 6 times more guanidine than the plant enzymes under the same conditions, but even 100-fold lower ethylene production would have been clearly detected.

      Page 12 - 'plants have been shown to....' Perhaps note how hydroxy guanidine is made?

      We now mention the canavanine-γ-lyase that cleaves canavanine into hydroxyguanidine and homoserine.

      Overall, I thought the discussion was good, but perhaps a bit long/too speculative on pages 12/13 and this detracted from the biochemical assignment of the enzyme. I'd suggest shortening the discussion somewhat - the precise roles of the enzyme can be the subject of future work. As indicated above, some discussion on potential links to ethylene would be appreciated.

      Since reviewer 2 wanted more (speculative) discussion on the role of the 2-ODD-C23 enzymes and there was no detectable ethylene production, we took the liberty to leave the discussion largely unaltered.

      I'd also like to see some more consideration/metabolic analyses of guanidine related metabolism in the genetically modified plants.

      Such analyses will certainly be included in future experiments once we get an idea about the physiological role of the 2-ODD-C23 enzymes.

      Page 16 - mass spectrometry

      Corrected.

      Please add a structurally informed sequence alignment with EFE and other 2OG oxygenases acting on arginine/derivatives.

      An excerpt of the alignment is now presented in supplementary figure S2.

      Reviewer #2 (Recommendations For The Authors):

      I would like to see more discussion in the manuscript about the possible interconnection/roles between the 2-ODD-C23 guanidine-producing, lysine-ALD1-pipecolate-producing, and proline metabolism pathways during both biotic and abiotic stresses.

      Since we were unable to detect pipecolate in any of our plant samples and also our preliminary results with biotic stress did not produce any evidence for a function of the 2-ODD-C23 enzymes in the tested defense responses, we would like to postpone such extended discussion until we find a condition where the physiological function of these enzymes is evident.

      Fig. 4: Authors should change colors for Col-0, 0.2 HoArg and ctrl? They look too similar in my pdf file.

      We changed the colors in figure 4 and hope that the enhanced contrast is maintained during the production of the final version of our article.

    3. Reviewer #1 (Public Review):

      Nitrogen metabolism is of fundamental importance to biology. However, the metabolism and biochemistry of guanidine and guanidine-containing compounds, including arginine and homoarginine, have been understudied over the last few decades. Very few guanidine-forming enzymes have been identified. Funck et al define a new type of guanidine-forming enzyme. It was previously known that 2-oxoglutarate oxygenase catalysis in bacteria can produce guanidine via oxidation of arginine. Interestingly, the same reported enzyme that produces guanidine from arginine also oxidises 2-oxoglutarate to give the plant signalling molecule ethylene. Funck et al show that a mechanistically related oxygenase enzyme from plants can also produce guanidine, but instead of using arginine as a substrate, it uses homoarginine and does not produce ethylene. The work will stimulate interest in the cellular roles of homoarginine, a metabolite present in plants and other organisms including humans and, more generally, in the biochemistry and metabolism of guanidine derivatives.

      1) Significance

      Studies on the metabolism and biochemistry of the small nitrogen-rich molecule guanidine and related compounds including arginine have been largely ignored over the last few decades. Very few guanidine-forming enzymes have been identified. Funck et al define a new guanidine-forming enzyme that works by oxidation of homoarginine, a metabolite present in organisms ranging from plants to humans. The new enzyme requires oxygen and 2-oxoglutarate as cosubstrates and is related to, but distinct from, a known enzyme that oxidises arginine to produce guanidine, but which can also oxidise 2-oxoglutarate to produce the plant signalling molecule ethylene.

      I thought this was an exceptionally well-written and interesting manuscript. Although a 2-oxoglutarate-dependent guanidine-forming enzyme is known (EFE), the discovery that a related enzyme oxidises homoarginine is really interesting, especially given the presence of homoarginine in plant seeds. There is more work to be done in terms of functional assignment, but this can be the subject of future studies. I also fully endorse the authors' view that guanidine and related compounds have been massively understudied in recent times. Congratulations to the authors on a very nice study.

      Overall, I thought this was a very interesting study, comprising biochemical, cellular, and in vivo studies. Of course, more could be done on each of these, and likely will be, but I think the assignment of biochemical function is very strong, across all three approaches. The one new experiment I requested was a demonstration of whether ethylene is produced by the new enzymes - this was clearly shown not to be the case.

    4. Reviewer #2 (Public Review):

      In this study, Dietmar Funck and colleagues have made a significant breakthrough by identifying three isoforms of plant 2-oxoglutarate-dependent dioxygenases (2-ODD-C23) as homo/arginine-6-hydroxylases, catalyzing the degradation of 6-hydroxyhomoarginine into 2-aminoadipate-6-semialdehyde (AASA) and guanidine. This discovery marks the very first confirmation of plant or eukaryotic enzymes capable of guanidine production.

      The authors selected three plant 2-ODD-C23 enzymes with the highest sequence similarity to bacterial guanidine-producing (EFE) enzymes. They proceeded to clone and express the recombinant enzymes in E. coli, demonstrating the capacity of all three Arabidopsis isoforms to produce guanidine. Additionally, by precise biochemical experiments, the authors established these three 2-ODD-C23 enzymes as homoarginine-6-hydroxylases (and arginine-hydroxylase for one of them). Furthermore, the authors utilized transgenic plants expressing GFP fusion proteins to show the cytoplasmic localization of all three 2-ODD-C23 enzymes. Most notably, using T-DNA mutant lines and CRISPR/Cas9-generated lines, along with combinations of them, they demonstrate the guanidine-producing capacity of each enzyme isoform in planta. These results provide robust evidence that these three 2-ODD-C23 Arabidopsis isoforms are indeed homoarginine-6-hydroxylases responsible for guanidine generation.

      The findings presented in this manuscript are a significant contribution to our understanding of plant biology, particularly given that this work is the first demonstration of enzymatic guanidine production in eukaryotic cells. However, there are a couple of concerns and potential avenues for further investigation that the authors should consider incorporating.

      Firstly, the observation of cytoplasmic and nuclear GFP signals in the transgenic plants may also indicate cleaved GFP from the fusion proteins. Thus, the authors should perform a Western blot analysis to confirm the correct size of the 2-ODD-C23 fusion proteins in the transgenic protoplasts.

      Secondly, it may be worth measuring pipecolate (and proline?) levels under biotic stress conditions (particularly those that induce transcript changes of these enzymes, Fig S8). Given the results suggesting a potential regulation of the pathway by biotic stress conditions (e.g. MeJA), these experiments could provide valuable insights into the physiological role of guanidine-producing enzymes in plants. This additional analysis may reveal the significance of these enzymes in plant defense mechanisms.

    1. Author Response

      eLife assessment

      This study presents a valuable finding on a new role of Foxp3+ regulatory T cells in sensory perception, which may have an impact on our understanding of somatosensory perception. The authors identified a previously unappreciated action of enkephalins released by immune cells in the resolution of pain and several upstream signals that can regulate the expression of the proenkephalin gene PENK in Foxp3+ Tregs. However, whereas the generation of transgenic mice with conditional deletion of PENK in Foxp3+ cells and PENK fate-mapping is novel and generates compelling data, the analysis of Tregs in the control and transgenic mice is incomplete, and the study includes neither proper tamoxifen controls nor an assessment of the role of PENK+ skin T cells to further support the hypothesis. Nonetheless, the study would be of interest to biologists working in the field of neuroimmunology and inflammation.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors explore mechanisms through which T-regs attenuate acute pain using a heat sensitivity paradigm. Analysis of available transcriptomic data revealed expression of the proenkephalin (Penk) gene in T-regs. The authors explore the contribution of T-reg Penk in the resolution of heat sensitivity.

      Strengths:

      Investigating the potential role of T-reg Penk in the resolution of acute pain is a strength.

      Weaknesses:

      The overall experimental design is superficial and lacks sufficient rigor to draw any meaningful conclusions.

      For instance:

      1) There were no TAM controls. What is the evidence that TAM does not alter heat-sensitive receptors?

      Author response: By comparing panels A and C, it appears that heat sensitivity in controls (blue dots) is slightly different before and after TMX administration, suggesting that heat-sensitive receptors are moderately altered by TMX per se. However, heat sensitivity is increased twofold in KO animals. Thus, a possible effect of TAM on heat receptors is not responsible for the heat hyperalgesia seen in KO animals, as shown in figures 4 and S3.

      2) There are no controls demonstrating that recombination actually occurred. How do the authors know a single dose of TAM is sufficient?

      Author response: These experiments are in progress. Specificity of the deletion will be presented in an updated version of the manuscript in the near future.

      3) Why was only heat sensitivity assessed? The behavioral tests are inadequate to derive any meaningful conclusions. Further, why wasn't the behavioral data plotted longitudinally?

      Author response: We respectfully point the reviewer to figure S3, where the longitudinal data are presented. New behavioral tests are being performed. The results will be presented in a revised version.

      Reviewer #2 (Public Review):

      Summary:

      The present study addresses the role of enkephalins, which are specifically expressed by regulatory T cells (Treg), in sensory perception in mice. The authors used a combination of transcriptomic databases available online to characterize the molecular signature of Treg. The proenkephalin gene Penk is among the most enriched transcripts, suggesting that Treg plays an analgesic role through the release of endogenous opioids. In addition, in silico analysis suggests that Penk is regulated by the TNFR superfamily; this being experimentally confirmed. Using flow cytometry analysis, the authors then show that Penk is mostly expressed in Treg of the skin and colon, compared to other immune cells. Finally, genetic conditional excision of Penk, selectively in Treg, results in heat hypersensitivity, as assessed by behavior analysis.

      Strengths:

      The manuscript is clear and reveals a previously unappreciated role of enkephalins, as released by immune cells, in sensory perception. The rationale in this manuscript is easy to follow, and conclusions are well supported by data.

      Weaknesses:

      The sensory deficit of Penk cKO appears to be quite limited compared to control littermates.

      Reviewer #3 (Public Review):

      Summary:

      Aubert et al investigated the role of PENK in regulatory T cells. Through the mining of publicly available transcriptome data, the authors confirmed that PENK expression is selectively enriched in regulatory but not conventional T cells. Further data mining suggested that OX40, 4-1BB as well as BATF, can regulate PENK expression in Tregs. The authors generated fate-mapping mice to confirm selective PENK expression in Tregs and activated effector T cells in the colon and spleen. Interestingly, transgenic mice with conditional deletion of PENK in Tregs resulted in hypersensitivity to heat, which the authors attributed to heat hyperalgesia.

      Strengths:

      The generation of transgenic mice with conditional deletion of PENK in foxp3 and PENK fate-mapping is novel and can potentially yield significant findings. The identification of upstream signals that regulate PENK is interesting but unlikely to be the main reason why PENK is predominantly expressed in Tregs as both BATF and TNFR are expressed in effector T cells.

      Weaknesses:

      There is a lack of direct evidence and detailed analysis of Tregs in the control and transgenic mice to support the authors' hypothesis. PENK was previously reported to be expressed in skin Tregs and play a significant role in regulating skin homeostasis: this should be considered as an alternative mechanism that may explain the changed sensitivity to heat observed in the paper.

      Author response: Supplementary figures are being prepared and new results are being collected to show that the KO does not perturb immune and/or skin homeostasis at the time of the experiments. These will be presented in a revised version.

    2. eLife assessment

      This study presents a valuable finding on a new role of Foxp3+ regulatory T cells in sensory perception, which may have an impact on our understanding of somatosensory perception. The authors identified a previously unappreciated action of enkephalins released by immune cells in the resolution of pain and several upstream signals that can regulate the expression of the proenkephalin gene PENK in Foxp3+ Tregs. However, whereas the generation of transgenic mice with conditional deletion of PENK in Foxp3+ cells and PENK fate-mapping is novel and generates compelling data, the analysis of Tregs in the control and transgenic mice is incomplete, and the study includes neither proper tamoxifen controls nor an assessment of the role of PENK+ skin T cells to further support the hypothesis. Nonetheless, the study would be of interest to biologists working in the field of neuroimmunology and inflammation.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors explore mechanisms through which T-regs attenuate acute pain using a heat sensitivity paradigm. Analysis of available transcriptomic data revealed expression of the proenkephalin (Penk) gene in T-regs. The authors explore the contribution of T-reg Penk in the resolution of heat sensitivity.

      Strengths:

      Investigating the potential role of T-reg Penk in the resolution of acute pain is a strength.

      Weaknesses:

      The overall experimental design is superficial and lacks sufficient rigor to draw any meaningful conclusions.

      For instance:

      1) There were no TAM controls. What is the evidence that TAM does not alter heat-sensitive receptors?

      2) There are no controls demonstrating that recombination actually occurred. How do the authors know a single dose of TAM is sufficient?

      3) Why was only heat sensitivity assessed? The behavioral tests are inadequate to derive any meaningful conclusions. Further, why wasn't the behavioral data plotted longitudinally?

    4. Reviewer #2 (Public Review):

      Summary:

      The present study addresses the role of enkephalins, which are specifically expressed by regulatory T cells (Treg), in sensory perception in mice. The authors used a combination of transcriptomic databases available online to characterize the molecular signature of Treg. The proenkephalin gene Penk is among the most enriched transcripts, suggesting that Treg plays an analgesic role through the release of endogenous opioids. In addition, in silico analysis suggests that Penk is regulated by the TNFR superfamily; this being experimentally confirmed. Using flow cytometry analysis, the authors then show that Penk is mostly expressed in Treg of the skin and colon, compared to other immune cells. Finally, genetic conditional excision of Penk, selectively in Treg, results in heat hypersensitivity, as assessed by behavior analysis.

      Strengths:

      The manuscript is clear and reveals a previously unappreciated role of enkephalins, as released by immune cells, in sensory perception. The rationale in this manuscript is easy to follow, and conclusions are well supported by data.

      Weaknesses:

      The sensory deficit of Penk cKO appears to be quite limited compared to control littermates.

    5. Reviewer #3 (Public Review):

      Summary:

      Aubert et al investigated the role of PENK in regulatory T cells. Through the mining of publicly available transcriptome data, the authors confirmed that PENK expression is selectively enriched in regulatory but not conventional T cells. Further data mining suggested that OX40, 4-1BB as well as BATF, can regulate PENK expression in Tregs. The authors generated fate-mapping mice to confirm selective PENK expression in Tregs and activated effector T cells in the colon and spleen. Interestingly, transgenic mice with conditional deletion of PENK in Tregs resulted in hypersensitivity to heat, which the authors attributed to heat hyperalgesia.

      Strengths:

      The generation of transgenic mice with conditional deletion of PENK in foxp3 and PENK fate-mapping is novel and can potentially yield significant findings. The identification of upstream signals that regulate PENK is interesting but unlikely to be the main reason why PENK is predominantly expressed in Tregs as both BATF and TNFR are expressed in effector T cells.

      Weaknesses:

      There is a lack of direct evidence and detailed analysis of Tregs in the control and transgenic mice to support the authors' hypothesis. PENK was previously reported to be expressed in skin Tregs and play a significant role in regulating skin homeostasis: this should be considered as an alternative mechanism that may explain the changed sensitivity to heat observed in the paper.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors have developed a compelling coarse-grained simulation approach for nucleosome-nucleosome interactions within a chromatin array. The data presented are solid and provide new insights that allow for predictions of how chromatin interactions might occur in vivo, but some of the claims should be tempered. The tools will be valuable for the chromosome biology field.

      Response: We want to thank the editors and all the reviewers for their insightful comments. We have made substantial changes to the manuscript to improve its clarity and temper necessary claims, as detailed in the responses, and we performed additional analyses to address the reviewers’ concerns. We believe that we have successfully addressed all the comments, and the quality of our paper has improved significantly.

      In the following, we provide point-by-point responses to all the reviewer comments.

      RESPONSE TO REFEREE 1:

      Comment 0: This study develops and applies a coarse-grained model for nucleosomes with explicit ions. The authors perform several measurements to explore the utility of a coarse-grained simulation method to model nucleosomes and nucleosome arrays with explicit ions and implicit water. ’Explicit ions’ means that the charged ions are modeled as particles in simulation, allowing the distributions and dynamics of ions to be measured. Since nucleosomes are highly charged and modulated by charge modifications, this innovation is particularly relevant for chromatin simulation.

      Response: We thank the reviewer for the excellent summary of the work.

      Comment 1: Strengths: This simulation method produces accurate predictions when compared to experiments for the binding affinity of histones to DNA, counterion interactions, nucleosome DNA unwinding, nucleosome binding free energies, and sedimentation coefficients of arrays. The variety of measured quantities makes both this work and the impact of this coarse-grained methodology compelling. The comparison between the contributions of sodium and magnesium ions to nucleosome array compaction, presented in Figure 3, was exciting and a novel result that this simulation methodology can assess.

      Response: We appreciate the reviewer’s strong assessment of the paper’s significance, novelty, and broad interest, and we thank him/her for the detailed suggestions and comments.

      Comment 2: Weaknesses: The presentation of experimental data as representing in vivo systems is a simplification that may misrepresent the results of the simulation work. In vivo, in this context, typically means experimental data from whole cells. What one could expect for in vivo experimental data is measurements on nucleosomes from cell lysates where various and numerous chemical modifications are present. On the contrary, some of the experimental data used as a comparison are from in vitro studies. In vitro in this context means nucleosomes were formed ’in a test tube’ or under controlled conditions that do not represent the complexity of an in vivo system. The simulations performed here are more directly compared to in vitro conditions. This distinction likely impacts to what extent these simulation results are biologically relevant. In vivo and in vitro differences could be clarified throughout and discussed.

      Response: As detailed in Response to Comment 3, we have made numerous modifications in the Introduction, Results, and Discussion Section to emphasize the differences between reconstituted and native nucleosomes. The newly added texts also delve into the utilization of the interaction strength measured for reconstituted nucleosomes as a reference point for conceptualizing the interactions among native nucleosomes.

      Comment 3: In the introduction (pg. 3), the authors discuss the uncertainty of nucleosome-to-nucleosome interaction strengths in vivo. For example, the authors discuss works such as Funke et al. However, Funke et al. used reconstituted nucleosomes from recombinant histones with one controlled modification (H4 acetylation). Therefore, this study that the authors discuss is measuring nucleosome’s in vitro affinity, and there could be significant differences in vivo due to various posttranslational modifications. Please revise the introduction, results section "Close contacts drive nucleosome binding free energy," and discussion to reflect and clarify the difference between in vitro and in vivo measurements. Please also discuss how biological variability could impact your findings in vivo. The works of Alexey Onufriev’s lab on the sensitivity of nucleosomes to charge changes (10.1016/j.bpj.2010.06.046, 10.1186/s13072-018-0181-5), such as some PTMs, are one potential starting place to consider how modifications alter nucleosome stability in vivo.

      Response: We thank the reviewer for the insightful comments and agree that native nucleosomes can differ from reconstituted nucleosomes due to the presence of histone modifications.

      We have revised the introduction to emphasize the differences between in vitro and in vivo nucleosomes. The new text now reads

      "The relevance of physicochemical interactions between nucleosomes to chromatin organization in vivo has been constantly debated, partly due to the uncertainty in their strength [cite]. Examining the interactions between native nucleosomes poses challenges due to the intricate chemical modifications that histone proteins undergo within the nucleus and the variations in their underlying DNA sequences [cite]. Many in vitro experiments have opted for reconstituted nucleosomes that lack histone modifications and feature well-positioned 601-sequence DNA to simplify the chemical complexity. These experiments aim to establish a fundamental reference point for understanding the strength of interactions within native nucleosomes. Nevertheless, even with reconstituted nucleosomes, a consensus regarding the significance of their interactions remains elusive. For example, using force-measuring magnetic tweezers, Kruithof et al. estimated the inter-nucleosome binding energy to be ∼ 14 kBT [cite]. On the other hand, Funke et al. introduced a DNA origami-based force spectrometer to directly probe the interaction between a pair of nucleosomes [cite], circumventing any potential complications from interpretations of single-molecule traces of nucleosome arrays. Their measurement reported a much weaker binding free energy of approximately 2 kBT. This large discrepancy in the reported reference values complicates a further assessment of the interactions between native nucleosomes and their contribution to chromatin organization in vivo."

      We modified the first paragraph of the results section to read

      "Encouraged by the explicit ion model’s accuracy in reproducing experimental measurements of single nucleosomes and nucleosome arrays, we moved to directly quantify the strength of inter-nucleosome interactions. We once again focus on reconstituted nucleosomes for a direct comparison with in vitro experiments. These experiments have yielded a wide range of values, ranging from 2 to 14 kBT [cite]. Accurate quantification will offer a reference value for conceptualizing the significance of physicochemical interactions among native nucleosomes in chromatin organization in vivo."

      New text was added to the Discussion Section to emphasize the implications of simulation results for interactions among native nucleosomes.

      "One significant finding from our study is the predicted strong inter-nucleosome interactions under the physiological salt environment, reaching approximately 9 kBT. We showed that the much lower value reported in a previous DNA origami experiment is due to the restricted nucleosomal orientation inherent to the device design. Unrestricted nucleosomes allow more close contacts to stabilize binding. A significant nucleosome binding free energy also agrees with the high forces found in single-molecule pulling experiments that are needed for chromatin unfolding [cite]. We also demonstrate that this strong inter-nucleosomal interaction is largely preserved at longer nucleosome repeat lengths (NRL) in the presence of linker histone proteins. While posttranslational modifications of histone proteins may influence inter-nucleosomal interactions, their effects are limited, as indicated by Ding et al. [cite], and are unlikely to completely abolish the significant interactions reported here. Therefore, we anticipate that, in addition to molecular motors, chromatin regulators, and other molecules inside the nucleus, intrinsic inter-nucleosome interactions are important players in chromatin organization in vivo."

      The suggested references (10.1016/j.bpj.2010.06.046, 10.1186/s13072-018-0181-5) are now included as citations # 44 and 45.

      Comment 4: Due to the implicit water model, do you know if ions can penetrate the nucleosome more? For example, does the lack of explicit water potentially cause sodium to cluster in the DNA grooves more than is biologically relevant, as shown in Figure 1?

      Response: We thank the reviewer for the insightful comments. The parameters of the explicit-ion model were deduced from all-atom simulations and fine-tuned to replicate crucial aspects of the local ion arrangements around DNA (1). The model’s efficacy was demonstrated in reproducing the radial distribution function of Na+ and Mg2+ ion distributions in the proximity of DNA (see Author response image 1). Consequently, the number of ions near DNA in the coarse-grained models aligns with that observed in all-atom simulations, and we do not anticipate any significant, unphysical clustering. It is worth noting that previous atomistic simulations have also reported the presence of a substantial quantity of Na+ ions in close proximity to nucleosomal DNA (refer to Author response image 2).

      Author response image 1.

      Comparison between the radial distribution functions of Na+ (left) and Mg2+ (right) ions around the DNA phosphate groups computed from all-atom (black) and coarse-grained (red) simulations. Figure adapted from Figure 4 of Ref. 1. The coarse-grained explicit ion model used in producing the red curves is identical to the one presented in the current manuscript.

      (© 2011, AIP Publishing. This figure is reproduced with permission from Figure 4 in Freeman GS, Hinckley DM, de Pablo JJ (2011) A coarse-grain three-site-per-nucleotide model for DNA with explicit ions. The Journal of Chemical Physics 135:165104. It is not covered by the CC-BY 4.0 license and further reproduction of this figure would need permission from the copyright holder.)
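
      The radial distribution function compared in the image above can be illustrated with a minimal sketch. It assumes a cubic periodic box and plain coordinate arrays; the function name, arguments, and binning are hypothetical and are not taken from the cited work or from the authors' code.

```python
import numpy as np


def radial_distribution(ref, ions, box_length, r_max, n_bins=50):
    """Estimate g(r) of `ions` around `ref` sites in a cubic periodic box.

    ref, ions: arrays of shape (n, 3) for a single frame (assumed layout).
    r_max should not exceed box_length / 2 for the minimum-image convention.
    """
    ref = np.asarray(ref, dtype=float)
    ions = np.asarray(ions, dtype=float)
    # Minimum-image displacement between every reference site and every ion.
    delta = ref[:, None, :] - ions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.linalg.norm(delta, axis=-1).ravel()
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    # Normalise by the ion count expected in each shell for an ideal gas.
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    ion_density = len(ions) / box_length ** 3
    g = counts / (len(ref) * ion_density * shell_volume)
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, g
```

      For a trajectory, one would average the per-frame histograms before normalising. Agreement between all-atom and coarse-grained curves computed this way is the kind of comparison the image reports.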

      Author response image 2.

      Three-dimensional distribution of sodium ions around the nucleosome determined from all-atom explicit solvent simulations. Darker blue colors indicate higher sodium density and high density of sodium ions around the DNA is clearly visible. The crystallographically identified acidic patch has been highlighted as spheres on the surface of the histone core and a high level of sodium condensation is observed around these residues. Figure adapted from Ref. 2.

      (© 2009, American Chemical Society. This figure is reproduced with permission from Figure 7 in Materese CK, Savelyev A, Papoian GA (2009) Counterion Atmosphere and Hydration Patterns near a Nucleosome Core Particle. J. Am. Chem. Soc. 131:15005–15013. It is not covered by the CC-BY 4.0 license and further reproduction of this figure would need permission from the copyright holder.)

      Comment 5: Histone side chain to DNA interactions, such as histone arginines to DNA, are essential for nucleosome stability. Therefore, can the authors provide validation or references supporting your model of the nucleosome with one bead per amino acid? I would like to see if the nucleosomes are stable in an extended simulation or if similar dynamic motions to all-atom simulations are observed.

      Response: The nucleosome model, which employs one bead per amino acid and lacks explicit ions, has undergone extensive calibration and has found application in numerous prior studies. For instance, the de Pablo group utilized a similar model to showcase its ability to accurately replicate the experimentally measured nucleosome unwinding free energy penalty (3), sequence-dependent nucleosome sliding (4), and the interaction between two nucleosomes (5). Similarly, the Takada group employed a comparable model to investigate acetylation-modulated tri-nucleosome structures (6), chromatin structures influenced by chromatin factors (7), and nucleosome sliding (8). Our group also employed this model to study the structural rearrangement of a tetranucleosome (9) and the folding of larger chromatin systems (10). In cases where data were available, simulations frequently achieved quantitative reproduction of experimental results.

      We added the following text to the manuscript to emphasize previous studies that validate the model accuracy.

      "We observe that residue-level coarse-grained models have been extensively utilized in prior studies to examine the free energy penalty associated with nucleosomal DNA unwinding [cite], sequence-dependent nucleosome sliding [cite], binding free energy between two nucleosomes [cite], chromatin folding [cite], the impact of histone modifications on tri-nucleosome structures [cite], and protein-chromatin interactions [cite]. The frequent quantitative agreement between simulation and experimental results supports the utility of such models in chromatin studies. Our introduction of explicit ions, as detailed below, further extends the applicability of these models to explore the dependence of chromatin conformations on salt concentrations."

      We agree that arginines are important for nucleosome stability. Since we assign positive charges to these residues, their contribution to DNA binding can be effectively captured. The model’s ability in reproducing nucleosome stability is supported by the good agreement between the simulated free energy penalty associated with nucleosomal DNA unwinding and experimental value estimated from single molecule experiments (Figure 1).

      To further evaluate nucleosome stability in our simulations, we conducted a 200-ns-long simulation of a nucleosome featuring the 601-sequence under physiological salt conditions (100 mM NaCl and 0.5 mM MgCl2), consistent with the conditions in Figure 1 of the main text. We found that the nucleosome maintains its overall structure during this simulation. The nucleosome’s radius of gyration (Rg) remained proximate to the value corresponding to the PDB structure (3.95 nm) throughout the entire simulation period (see Author response image 3).

      Author response image 3.

      Time trace of the radius of gyration (Rg) of a nucleosome with the 601-sequence along an unbiased, equilibrium trajectory. It is evident the Rg fluctuates around the value found in the PDB structure (3.95 nm), supporting the stability of the nucleosome in our simulation.

      Occasional fluctuations in Rg corresponded to momentary, partial unwrapping of the nucleosomal DNA, a phenomenon observed in single-molecule experiments. However, we advise caution due to the coarse-grained nature of our simulations, which prevents a direct mapping of simulation timescale to real time. Importantly, the rate of DNA unwrapping in our simulations is notably overestimated.
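
      For readers who wish to reproduce a trace like the one in Author response image 3, the radius of gyration of one frame can be computed as in the sketch below. The function name and the optional mass weighting are assumptions made for illustration; the source does not state whether the reported Rg is mass-weighted.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Radius of gyration of one frame of bead coordinates, shape (n, 3).

    If `masses` is None, every bead is weighted equally (an assumption).
    """
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    masses = np.asarray(masses, dtype=float)
    # Weighted centre of the structure.
    center = np.average(coords, axis=0, weights=masses)
    # Weighted mean squared distance from the centre.
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))
```

      Applying the function to each saved frame gives a time trace whose excursions above the reference value would correspond to the partial unwrapping events described above.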

      It’s plausible that coarse-grained models, lacking side chains, might underestimate the barrier for DNA sliding along the nucleosome. Specifically, our model, without differentiation between interactions among various amino acids and nucleotides, accurately reproduces the average nucleosomal DNA binding affinity but may not capture the energetic variations among binding interfaces. Since sliding’s contribution to chromatin organization is minimal due to the use of strongly positioning 601 sequences, we imposed rigidity on the two nucleotides situated at the dyad axis to prevent nucleosomal DNA sliding. In future studies, enhancing the calibration of protein-DNA interactions to achieve improved sequence specificity would be an intriguing avenue. To underscore this limitation of the model, we have included the following text in the discussion section of the main text.

      "Several aspects of the coarse-grained model presented here can be further improved. For instance, the introduction of specific protein-DNA interactions could help address the differences in non-bonded interactions between amino acids and nucleotides beyond electrostatics [cite]. Such a modification would enhance the model’s accuracy in predicting interactions between chromatin and chromatin-proteins. Additionally, the single-bead-per-amino-acid representation used in this study encounters challenges when attempting to capture the influence of histone modifications, which are known to be prevalent in native nucleosomes. Multiscale simulation approaches may be necessary [cite]. One could first assess the impact of these modifications on the conformation of disordered histone tails using atomistic simulations. By incorporating these conformational changes into the coarse-grained model, systematic investigations of histone modifications on nucleosome interactions and chromatin organization can be conducted. Such a strategy may eventually enable the direct quantification of interactions among native nucleosomes and even the prediction of chromatin organization in vivo."

      Comment 6: The solvent salt conditions vary in the experimental reference data for internucleosomal interaction energies. The authors note, for example, that the in vitro data from Funke et al. differs the most from other measurements, but the solvent conditions are 35 mM NaCl and 11 mM MgCl2. Since this simulation method allows for this investigation, could the authors speak to or investigate if solvent conditions are responsible for the variability in experimental reference data? The authors conclude on pg. 8-9 and Figure 4 that orientational restraints in the DNA origami methodology are responsible for differences in interaction energy. Can the authors rule out ion concentration contributions?

      Response: We thank the reviewer for the insightful comment. We would like to clarify that the black curve presented in Figure 4B of the main text was computed using the salt concentration specified by Funke et al. (35 mM NaCl and 11 mM MgCl2). Furthermore, there were no restraints placed on nucleosome orientations during these calculations. Consequently, the results in Figure 4B can be directly compared with the black curve in Figure 5C. The data in Figure 5C were calculated under physiological salt conditions (150 mM NaCl and 2 mM MgCl2), which are the standard solvent salt conditions used in most studies. It is worth noting that the free energy of nucleosome binding is significantly higher at the salt concentration employed by Funke et al. (14 kBT) than the value at the physiological salt condition (9 kBT). Therefore, comparing the results in Figure 4B and 5C eliminates ion concentration conditions as a potential cause for the almost negligible result reported by Funke et al.

      Comment 7: In the discussion on pg. 12 residual-level should be residue-level.

      Response: We apologize for the oversight and have corrected the grammatical error in our manuscript.

      RESPONSE TO REFEREE 2:

      Comment 0: In this manuscript, the authors introduced an explicit ion model using the coarse-grained modelling approach to model the interactions between nucleosomes and evaluate their effects on chromatin organization. The strength of this method lies in the explicit representation of counterions, especially divalent ions, which are notoriously difficult to model. To achieve their aims and validate the accuracy of the model, the authors conducted coarse-grained molecular dynamics simulations and compared predicted values to the experimental values of the binding energies of protein-DNA complexes and the free energy profile of nucleosomal DNA unwinding and inter-nucleosome binding. Additionally, the authors employed umbrella sampling simulations to further validate their model, reproducing experimentally measured sedimentation coefficients of chromatin under varying salt concentrations of monovalent and divalent ions.

      Response: We thank the reviewer for the excellent summary of the work.

      Comment 1: The significance of this study lies in the authors’ coarse-grained model which can efficiently capture the conformational sampling of molecules while maintaining a low computational cost. The model reproduces the scale and, in some cases, the shape of the experimental free energy profile for specific molecule interactions, particularly inter-nucleosome interactions. Additionally, the authors’ method resolves certain experimental discrepancies related to determining the strength of inter-nucleosomal interactions. Furthermore, the results from this study support the crucial role of intrinsic physicochemical interactions in governing chromatin organization within the nucleus.

      Response: We appreciate the reviewer’s strong assessment of the paper’s significance, novelty, and broad interest, and we thank him/her for the detailed suggestions and comments.

      Comment 2: The method is simple but can be useful, given the authors can provide more details on their ion parameterization. The paper says that parameters in their “potentials were tuned to reproduce the radial distribution functions and the potential of mean force between ion pairs determined from all-atom simulations.” However, no details on their all-atom simulations were provided; at some point, the authors refer to Reference 67 which uses all-atom simulations but does not employ the divalent ions. Also, no explanation is given for their modelling of protein-DNA complexes.

      Response: We appreciate the reviewer’s suggestion on clarifying the parameterization of the explicit-ion model. The parameterization was not carried out in reference 67 nor by us, but by the de Pablo group in citation 53. Specifically, ion potentials were parameterized to fit the potential of mean force between both monovalent and divalent ion pairs, calculated either from all-atom simulations or from the literature. The authors carried out extensive validations of the model parameters by comparing the radial distribution functions of ions computed using the coarse-grained model with those from all-atom simulations. Good agreement between coarse-grained and all-atom results ensures the parameters’ accuracy in reproducing the local structures of ion interactions.

      To avoid confusion, we have revised the text from:

      "Parameters in these potentials were tuned to reproduce the radial distribution functions and the potential of mean force between ion pairs determined from all-atom simulations."

      to

      "Parameters in these potentials were tuned by Freeman et al. [cite] to reproduce the radial distribution functions and the potential of mean force between ion pairs determined from all-atom simulations."

      We modified the Supporting Information at several places to clarify the setup and interpretation of protein-DNA complex simulations.

      For example, we clarified the force fields used in these simulations with the following text

      "All simulations were carried out using the software Lammps [cite] with the force fields defined in the previous two sections."

      We added details on the preparation of these simulations as follows

      "We carried out a series of umbrella-sampling simulations to compute the binding free energies of a set of nine protein-DNA complexes with experimentally documented binding dissociation constants [cite]. Initial configurations of these simulations were prepared using the crystal structures with the corresponding PDB IDs listed in Fig. S1."

      We further revised the caption of Figure S1 (included as Author response image 4) to facilitate the interpretation of simulation results.

      Author response image 4.

      The explicit-ion model predicts the binding affinities of protein-DNA complexes well, related to Fig. 1 of the main text. Experimental and simulated binding free energies are compared for nine protein-DNA complexes [cite], with a Pearson Correlation coefficient of 0.6. The PDB ID for each complex is indicated in red, and the diagonal line is drawn in blue. The significant correlation between simulated and experimental values supports the accuracy of the model. To further enhance the agreement between the two, it will be necessary to implement specific non-bonded interactions that can resolve differences among amino acids and nucleotides beyond simple electrostatics. Such modifications will be interesting avenues for future research. See text Section: Binding free energy of protein-DNA complexes for simulation details.
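
      The comparison summarized in this caption rests on two standard steps: converting a dissociation constant into a binding free energy, and correlating the experimental and simulated values. The sketch below illustrates both. The helper names and the 1 M standard-state concentration are assumptions for illustration, and no values from Fig. S1 are reproduced here.

```python
import numpy as np


def binding_free_energy_kT(kd_molar, standard_conc=1.0):
    """Binding free energy in units of kBT from a dissociation constant.

    Uses dG / kBT = ln(Kd / c0), with c0 the standard-state concentration
    (assumed to be 1 M here). More negative values mean tighter binding.
    """
    return np.log(np.asarray(kd_molar, dtype=float) / standard_conc)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])
```

      In use, one would pass the experimental Kd values through `binding_free_energy_kT` and compare the result with the simulated free energies via `pearson_r`.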

      Comment 3: Overall, the paper is well-written, concise and easy to follow but some statements are rather blunt. For example, the linker histone contribution (Figure 5D) is not clear and could be potentially removed. The result on inter-nucleosomal interactions and comparison to experimental values from Ref#44 is the most compelling. It would be nice to see if the detailed shape of the profile for restrained inter-nucleosomal interactions in Figure 4B corresponds to the experimental profile. Including the dependence of free energy on a vertex angle would also be beneficial.

      Response: We thank the reviewer for the comments and agree that the discussion on linker histone results was brief. However, we believe the results are important and demonstrate our model’s advantage over mesoscopic approaches in capturing the impact of chromatin regulators on chromatin organization.

      Therefore, instead of removing the result, we expanded the text to better highlight its significance, to aid comprehension, and to emphasize its biological implications. The image in Figure 5D was also redesigned to better visualize the cross contacts between nucleosomes mediated by histone H1. The added text is quoted below, and the new Figure 5 is included.

      Author response image 5.

      Revised main text Figure 5, with Figure 5D modified for improved visual clarity.

      "Importantly, we found that the weakened interactions upon extending linker DNA can be more than compensated for by the presence of histone H1 proteins. This is demonstrated in Fig. 5C and Fig. S8, where the free energy cost for tearing apart two nucleosomes with 167 bp DNA in the presence of linker histones (blue) is significantly higher than the curve for bare nucleosomes (red). Notably, at larger inter-nucleosome distances, the values even exceed those for 147 bp nucleosomes (black). A closer examination of the simulation configurations suggests that the disordered C-terminal tail of linker histones can extend and bind the DNA from the second nucleosome, thereby stabilizing the internucleosomal contacts (as shown in Fig. 5D). Our results are consistent with prior studies that underscore the importance of linker histones in chromatin compaction [cite], particularly in eukaryotic cells with longer linker DNA [cite]."

      We further compared the simulated free energy profile, depicting the center of mass distance between nucleosomes, with the experimental profile, as depicted in Author response image 6. The agreement between the simulated and experimental results is evident. The nuanced features observed between 60 and 80 Å in the simulated profile stem from DNA unwinding to accommodate the incoming nucleosome, creating a small energy barrier. It’s worth noting that such unwinding is unlikely to occur in the experimental setup due to the hybridization method used to anchor nucleosomes onto the DNA origami. Moreover, our simulation did not encompass configurations below 60 Å, resulting in a lack of data in that region within the simulated profile.

      We projected the free energy profile onto the vertex angle of the DNA origami device, utilizing the angle between two nucleosome faces as a proxy. Once more, the simulated profile demonstrates reasonable agreement with the experimental data (Author response image 6). Author response image 6 has been incorporated as Figure S4 in the Supporting Information.

      Author response image 6.

      Explicit ion modeling reproduces the experimental free energy profiles of nucleosome binding. (A) Comparison between the simulated (black) and experimental (red) free energy profile as a function of the inter-nucleosome distance. Error bars were computed as the standard deviation of three independent estimates. The barrier observed between 60 Å and 80 Å arises from the unwinding of nucleosomal DNA when the two nucleosomes are in close proximity, as highlighted in the orange circle. (B) Comparison between the simulated (black) and experimental (red) free energy profile as a function of the vertex angle. Error bars were computed as the standard deviation of three independent estimates. (C) Illustration of the vertex angle Φ used in panel (B).
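
      The error bars described in this caption can be illustrated with a short sketch that combines several independent free energy profiles on a common grid. The function name, the alignment of each profile to its own minimum, and the use of the sample standard deviation are assumptions; the caption only states that the standard deviation of three independent estimates was used.

```python
import numpy as np


def profile_mean_and_error(estimates):
    """Mean profile and per-bin standard deviation across estimates.

    estimates: array of shape (n_estimates, n_bins), free energies in kBT,
    all evaluated on the same collective-variable grid.
    """
    estimates = np.asarray(estimates, dtype=float)
    # Free energies are defined up to a constant, so shift each estimate to
    # put its minimum at zero before comparing (assumed convention).
    aligned = estimates - estimates.min(axis=1, keepdims=True)
    mean = aligned.mean(axis=0)
    # ddof=1 gives the sample standard deviation (assumed here).
    error = aligned.std(axis=0, ddof=1)
    return mean, error
```

      The returned `error` array is what one would draw as error bars around the mean profile in panels (A) and (B).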

      Comment 4: Another limitation of this study is that the authors’ model sacrifices certain atomic details and thermodynamic properties of the modelled systems. The potential parameters of the counter ions were derived solely by reproducing the radial distribution functions (RDFs) and potential of mean force (PMF) based on all-atom simulations (see Methods), without considering other biophysical and thermodynamic properties from experiments. Lastly, the authors did not provide any examples or tutorials for other researchers to utilize their model, thus limiting its application.

      Response: We agree that residue-level coarse-grained modeling indeed sacrifices certain atomistic details. This sacrifice can be potentially limiting when studying the impact of chemical modifications, especially on histone and DNA methylations. We added a new paragraph in the Discussion Section to point out such limitations and the relevant text is quoted below.

      "Several aspects of the coarse-grained model presented here can be further improved. For instance, the introduction of specific protein-DNA interactions could help address the differences in non-bonded interactions between amino acids and nucleotides beyond electrostatics [cite]. Such a modification would enhance the model’s accuracy in predicting interactions between chromatin and chromatin-proteins. Additionally, the single-bead-per-amino-acid representation used in this study encounters challenges when attempting to capture the influence of histone modifications, which are known to be prevalent in native nucleosomes. Multiscale simulation approaches may be necessary [cite]. One could first assess the impact of these modifications on the conformation of disordered histone tails using atomistic simulations. By incorporating these conformational changes into the coarse-grained model, systematic investigations of histone modifications on nucleosome interactions and chromatin organization can be conducted. Such a strategy may eventually enable the direct quantification of interactions among native nucleosomes and even the prediction of chromatin organization in vivo."

      Nevertheless, it’s important to note that while the model sacrifices accuracy, it compensates with superior efficiency. Atomistic simulations face significant challenges in conducting extensive free energy calculations required for a quantitative evaluation of ion impacts on chromatin structures.

      The explicit ion model, introduced by the de Pablo group, follows a standard approach adopted by other research groups, such as the parameterization of ion models using the potential of mean force from atomistic simulations (11; 12). According to multiscale coarse-graining theory, reproducing the potential of mean force (PMF) enables the coarse-grained model to achieve thermodynamic consistency with the atomistic model, ensuring identical statistical properties derived from them. However, it’s crucial to recognize that an inherent limitation of such approaches is their dependence on the accuracy of atomistic force fields in reproducing thermodynamic properties from experiments, as any inaccuracies in the atomistic force fields will similarly affect the resulting coarse-grained (CG) model.
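
      The link between the radial distribution function and the pair potential of mean force mentioned above is the standard relation W(r) = -kBT ln g(r). A minimal sketch follows; the function name is hypothetical, and this is not necessarily the procedure by which the cited groups obtained their PMFs.

```python
import numpy as np


def pmf_from_rdf(g, kT=1.0):
    """Pair potential of mean force W(r) = -kT * ln g(r).

    g: radial distribution function values on some radial grid.
    Bins where g is zero map to +inf, meaning the separation was not sampled.
    """
    g = np.asarray(g, dtype=float)
    with np.errstate(divide="ignore"):
        return -kT * np.log(g)
```

      A coarse-grained ion potential tuned so that its simulated g(r) matches the atomistic one reproduces the same W(r), which is the sense of consistency referred to above.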

      We have provided the implementation of the CG model and detailed instructions on setting up and performing simulations in a GitHub repository. Examples include simulation setup for a protein-DNA complex and for a nucleosome with the 601-sequence.

      References

      [1] Freeman GS, Hinckley DM, de Pablo JJ (2011) A coarse-grain three-site-per-nucleotide model for DNA with explicit ions. The Journal of Chemical Physics 135:165104.

      [2] Materese CK, Savelyev A, Papoian GA (2009) Counterion Atmosphere and Hydration Patterns near a Nucleosome Core Particle. J. Am. Chem. Soc. 131:15005–15013.

      [3] Lequieu J, Córdoba A, Schwartz DC, de Pablo JJ (2016) Tension-Dependent Free Energies of Nucleosome Unwrapping. ACS Cent. Sci. 2:660–666.

      [4] Lequieu J, Schwartz DC, De Pablo JJ (2017) In silico evidence for sequence-dependent nucleosome sliding. Proc. Natl. Acad. Sci. U.S.A. 114.

      [5] Moller J, Lequieu J, de Pablo JJ (2019) The Free Energy Landscape of Internucleosome Interactions and Its Relation to Chromatin Fiber Structure. ACS Cent. Sci. 5:341–348.

      [6] Chang L, Takada S (2016) Histone acetylation dependent energy landscapes in trinucleosome revealed by residue-resolved molecular simulations. Sci Rep 6:34441.

      [7] Watanabe S, Mishima Y, Shimizu M, Suetake I, Takada S (2018) Interactions of HP1 Bound to H3K9me3 Dinucleosome by Molecular Simulations and Biochemical Assays. Biophysical Journal 114:2336–2351.

      [8] Brandani GB, Niina T, Tan C, Takada S (2018) DNA sliding in nucleosomes via twist defect propagation revealed by molecular simulations. Nucleic Acids Research 46:2788–2801.

      [9] Ding X, Lin X, Zhang B (2021) Stability and folding pathways of tetra-nucleosome from six-dimensional free energy surface. Nat Commun 12:1091.

      [10] Liu S, Lin X, Zhang B (2022) Chromatin fiber breaks into clutches under tension and crowding. Nucleic Acids Research 50:9738–9747.

      [11] Savelyev A, Papoian GA (2010) Chemically accurate coarse graining of double-stranded DNA. Proc. Natl. Acad. Sci. U.S.A. 107:20340–20345.

      [12] Noid WG (2013) Perspective: Coarse-grained models for biomolecular systems. The Journal of Chemical Physics 139:090901.

    1. Author Response

      The following is the authors’ response to the current reviews.

      Response to Reviewer Comments:

      We thank the editors and reviewers for their careful consideration of our revised manuscript. Reviewers 2 and 3 indicated that their previous comments had been satisfactorily addressed by our revisions. Reviewer 1 raised several points, and our point-by-point responses can be found below.

      Reviewer #1 (Recommendations For The Authors):

      1) Please clarify the terminology of spontaneous recovery in your study.

      According to Rescorla RA 2004 ( http://www.learnmem.org/cgi/doi/10.1101/lm.77504.), he defines spontaneous recovery as "with the passage of time following nonreinforcement, there is some "spontaneous recovery" of the initially learned behavior. ". So in this study, I thought Test2 is spontaneous recovery while the Test1 is extinction test as most studies do. But authors seem to define spontaneous recovery from the last trial of Extinction3 to the first trial of Test1, which is confusing to me.

      We agree with the reviewer (and Rescorla, 2004) that spontaneous recovery is defined as the return of the initially learned behaviour after the passage of time. In our study, Test 1 is conducted 24-hours after the final extinction session (Extinction 3) and in our view, the return of responding following that 24-hour delay can be considered spontaneous recovery. Rescorla (2004 and elsewhere) also points out that the magnitude of spontaneous recovery may be greater with larger delays between extinction and testing. This in part motivated our second test 7 days following the last extinction session with optogenetic manipulation. We did not find evidence of greater spontaneous recovery in the test 7 days later, however, the additional extinction trials in Test 1 may have reduced the opportunity to detect such an effect.

      2) Why are E6-8 plots of Offset group in Figure 3E and F different?

      We apologise for this error and have corrected it. This was an artifact of an older version of the figure before final exclusions. The E6-8 data is now the same for panels 2E and 2F.

      3) Related to 2, please clarify what type of data are shown in Figures 3E, 3F, 5H, and 5I. If it's average, please add error bars. Also, it's hard to see the statistical significance at the current figure style.

      The data in these panels are the mean lever presses per trial as labeled on the y-axis of the figures. In our view, in this instance, error bars (or lines and other markers of significance) detract from the visual clarity of the figure. The statistical approach and outcomes are included in the figure legend and when presented alongside the figure in the final version of the paper should directly clarify these points.

      Reviewer #2 (Recommendations For The Authors):

      The authors have addressed my previous comments to my satisfaction.

      Reviewer #3 (Recommendations For The Authors):

      The authors have adequately addressed each of the points raised in my original review. The paper will make a nice contribution to the field.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      • It would be interesting if the authors would do calcium imaging or electrophysiology from LC-NA neurons during appetitive extinction.

      Indeed these are interesting ideas. We have plans to pursue them but ongoing work is not yet ready for publication.

      • LC-NA neuronal responses during the omission period seem to be important for appetitive extinction as described in the manuscript (Park et al., 2013; Sara et al., 1994; Su & Cohen 2022). It would be nice to activate/inactivate LC-NA neurons during the omission period.

      Optogenetic manipulation was given for the duration of the stimulus (20 seconds; when reward should be expected contingent upon performance of the instrumental response). We believe the reviewer is suggesting a briefer manipulation only at the precise time the pellet would have been expected but was omitted. If so, implementing this is complex: animals were trained on random ratio schedules, so the moment at which each pellet was earned varied, and the precise time at which the animal experiences “omission” is difficult to know with better temporal specificity than was used in the current experiments. We agree with the reviewer, however, that now that we have seen an effect of LC manipulation, future studies could alter the behavioral task so that the timing of reward is consistent (e.g., train the animals with fixed ratio schedules or continuous reinforcement, or use a Pavlovian paradigm). A reasonable assertion could then be made about when the outcome should occur, and thus when its absence would be detected, and manipulation could be given at that time to address this point.

      • Does LC-NA optoinhibition affect the expression of the conditioned response (the lever presses at early trials of Extinction 1)? It's hard to see this from the average of all trials.

      The eNpHR group responded numerically less overall during extinction. This effect appears greatest in the first extinction session, but fails to reach statistical significance [F(1,15)= 3.512, p=0.081]. Likewise, analysis of the trial by trial data for the first extinction session failed to reveal any group differences [F(1,15)= 3.512, p=0.081] or interaction [trial x group; F(1,15)=0.550, p=0.470].

      Comparison of responding in the first trial also failed to reveal group differences [F(1,15)=1.209, p=0.289]. Thus, while there is a trend in the data, this is not borne out by the statistical analysis, even in early trials of the session.

      • While the authors manipulate global LC-NA neurons, many people find the heterogeneous populations in the LC. It would be great if the authors could identify the subpopulation responsible for appetitive extinction.

      We agree that it would be exciting to test whether and identify which subpopulation(s) of cells or pathway(s) are responsible for appetitive extinction. While related work has found that discrete populations of LC neurons mediate different behaviours and states, and may even have opposing effects, our initial goal was to determine whether the LC was involved in appetitive extinction learning. These are certainly ideas we hope to pursue in future work.

      Minor:

      • Why do the authors choose 10Hz stimulation?

      The stimulation parameters were based on previously published work. We have added these citations to the manuscript.

      Quinlan MAL, Strong VM, Skinner DM, Martin GM, Harley CW, Walling SG. Locus Coeruleus Optogenetic Light Activation Induces Long-Term Potentiation of Perforant Path Population Spike Amplitude in Rat Dentate Gyrus. Front Syst Neurosci. 2019 Jan 9;12:67. doi: 10.3389/fnsys.2018.00067. PMID: 30687027; PMCID: PMC6333706.

      Glennon E, Carcea I, Martins ARO, Multani J, Shehu I, Svirsky MA, Froemke RC. Locus coeruleus activation accelerates perceptual learning. Brain Res. 2019 Apr 15;1709:39-49. doi: 10.1016/j.brainres.2018.05.048. Epub 2018 May 31. PMID: 29859972; PMCID: PMC6274624.

      Vazey EM, Moorman DE, Aston-Jones G. Phasic locus coeruleus activity regulates cortical encoding of salience information. Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):E9439-E9448. doi: 10.1073/pnas.1803716115. Epub 2018 Sep 19. PMID: 30232259; PMCID: PMC6176602.

      • The authors should describe the behavior task before explaining Fig1e-g results.

      We agree that introducing the task earlier would improve clarity and have added a brief summary of the task at the beginning of the results section (before reference to Figure 1) and point the reader to the schematics that summarize training for each experiment (Figures 2A and 4D).

      NOTE R2 includes specific comments in their Public review. We have considered those as their recommendations and address them here.

      1) In such discrimination training, Pavlovian (CS-Food) and instrumental (LeverPress-Food) contingencies are intermixed. It would therefore be very interesting if the authors provided evidence of other behavioural responses (e.g. magazine visits) during extinction training and tests.

      In a discriminated operant procedure, the DS (e.g. clicker) indicates when the instrumental response will be reinforced (e.g., lever-pressing is reinforced only when the stimulus is present, and not when the stimulus is absent). This is distinct from something like a Pavlovian-instrumental transfer procedure, and so we wish to clarify that there is no Pavlovian phase where the stimuli are directly paired with food. After a successful lever-press the rat must enter the magazine to collect the food, but food is only delivered contingent upon lever-pressing, and so magazine entries here are not a clear indicator of Pavlovian learning as they may be in other paradigms.

      Nonetheless, we have compiled magazine entry data which although not fully independent of the lever-press response in this paradigm, still tells us something about the animals’ expectation regarding reward delivery.

      For the ChR2 experiment, largely paralleling the results seen in the lever-press data, there were no group differences in magazine responses at the end of training [F(2,40)=2.442, p=0.100].

      Responding decreased across days of extinction (when optogenetic stimulation was given) [F(2, 80)=38.070, p<0.001], but there was no effect of group [F(2,40)=0.801, p=0.456] and no interaction between day and group [F(4,40)=1.461, p=0.222]. Although a similar pattern is seen in the test data, group differences were not statistically different in the first [F(2,40)=2.352, p=0.108] or second [F(2,40)=1.900, p=0.166] tests, perhaps because magazine responses were quite low. Thus, overall, magazine data do not present a different picture than lever-pressing, but because of the lack of statistical effects during testing, we have chosen not to include these data in the manuscript.

      For the eNpHR experiment, again a similar pattern to lever-pressing was seen. There were no group differences at the end of acquisition [F(1,15)=0.290, p=0.598]. Responding decreased across days of extinction [F(2, 30)=4.775, p=0.016] but there was no main effect of group [F(1,15)=1.188, p=0.293], and no interaction between extinction and group [F(2,30)=0.070, p=0.932]. There were no group differences in the number of magazine entries in Test 1 [F(1,15)=1.378, p=0.259] or Test 2 [F(1,15)=0.319, p=0.580].

      Author response image 1.

      Author response image 2.

      2) In Figure 1, the authors show the behavioural data of the different groups of control animals which were later collapsed in a single control group. It would be very nice if the authors could provide the data for each step of the discrimination training.

      We are a little confused by this comment. Figure 1, panels E, F, and G show the different control groups at the end of training, for each day of extinction (when manipulations occurred) and for each test, respectively. It’s not clear if there is an additional step the reviewer is interested in? We note neural manipulation only occurred during extinction sessions.

      We chose to compare the control groups initially and, finding no differences, to collapse them for subsequent analyses, as this simplifies the statistical analysis substantially; when group differences are found, each of the subgroups has to be investigated (including the different controls means there are 5 groups instead of 3). It doesn’t change the story because we tested that there were no differences between controls before collapsing them, but collapsing the controls makes the presentation of the statistical data much shorter and easier to follow.

      3) Inspection of Figures 2C & 2D shows that responding in control animals is about the same at test 2 as at the end of extinction training. Therefore, could the authors provide evidence for spontaneous recovery in control animals? This is of importance given that the main conclusion of the authors is that LC stimulation during extinction training led to an increased expression of extinction memory as expressed by reduced spontaneous recovery.

      To address this we have added analyses of trial data, specifically comparison of the final 3 trials of extinction to the subsequent three trials of each test. These analyses are included on page 5 of the manuscript and additional data figures can be found as panels 2E and 2F and pasted below.

      What we observe in the trial data for controls is an increase in responding from the end of extinction to the beginning of each test, thus demonstrating spontaneous recovery. Importantly, responding in the ChR2 group does not increase from the end of extinction to the beginning of the test, illustrating that LC stimulation during extinction prevents spontaneous recovery.
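      The comparison described above (final three trials of extinction versus the three trials of a test) can be sketched as a simple difference of means. This is only an illustration of the arithmetic, not the authors' analysis code: the function name `recovery_change` and the trial counts below are hypothetical and are not data from the study.

```python
from statistics import mean


def recovery_change(extinction_trials, test_trials, n=3):
    """Mean responding in the first n test trials minus the
    mean responding in the last n extinction trials.

    A positive value indicates an increase in responding from the
    end of extinction to the start of the test (spontaneous recovery).
    """
    if len(extinction_trials) < n or len(test_trials) < n:
        raise ValueError("need at least n trials in each session")
    return mean(test_trials[:n]) - mean(extinction_trials[-n:])


# Hypothetical lever-press counts per trial (NOT data from the study).
extinction = [12, 10, 9, 7, 6, 4, 3, 2]  # eight trials in the last extinction session
test = [6, 5, 4]                         # three trials in the test session

change = recovery_change(extinction, test)
print(change)
```

      Matching the number of trials on each side of the comparison avoids averaging the test over fewer trials than the extinction session, which is the concern that motivated the trial-level analysis.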

      Comparison of the final three trials of Extinction to the three trials of Test 1:

      Author response image 3.

      Comparison of the final three trials of Extinction to the three trials of Test 2:

      Author response image 4.

      Halorhodopsin Experiment Tests 1 and 2, respectively.

      Author response image 5.

      4) Current evidence suggests that there are differences in LC/NA system functioning between males and females. Could the authors provide details about the allocation of male and female animals in each group?

      More females than males had surgical complications (excess bleeding), resulting in the following allocations: control group, 14 males and 8 females; ChR2 group, 8 males and 7 females; Offset group, 6 males.

      In our dataset, we did not detect sex differences in training [no main effect of sex: F(1,38)=1.097, p=0.302; sex x group interaction: F(1,38)=1.825, p=0.185], extinction [no effect of sex: F(1,38)=0.370, p=0.547; no sex x extinction interaction: F(2,76)=0.701, p=0.499; no sex x extinction x group interaction: F(2,76)=2.223, p=0.115] or testing [Test 1 no effect of sex: F(1,38)=1.734, p=0.196; no sex x group interaction: F(1,38)=0.009, p=0.924; Test 2 no effect of sex: F(1,38)=0.661, p=0.421; no sex x group interaction: F(1,38)=0.566, p=0.456].

      5) The histology section in both experiments looks a bit unsatisfying. Could the authors provide more details about the number of counted cells and also their distribution along the anteroposterior extent of the LC. Could the authors also take into account the sex in such an analysis?

      The antero-posterior coordinates used for cell counts and calculation of % infection rates were between -9.68 and -10.04 (Paxinos and Watson, 2007, 6th Edition) as infection rates were most consistent in this region and it was well-positioned relative to the optic probe although TH and mCherry positive cells were observed both rostral and caudal to this area. For each animal, an average of ~116+/- 25 TH-positive LC neurons as determined by DAPI and GFP positive cells were identified. Viral expression was identified by colocalized mCherry staining. Animals that did not have viral expression in the LC were not included in the experimental groups. We have added these details to the histology results on page 4.

      Males and females showed very similar infection rates (Males, 74%; Females, 72%). While sex differences, such as total number of LC cells or total LC volume have been reported (Guillamon, A. et al. 2005), Garcia-Falgueras et al. (2005) reported no differences in LC volume or number of LC neurons between male and female Long-Evans rats. So while differences may exist in the LC of Long-Evans rats, the cell counts here were comparable between groups (males, 103 +/- 27; females, 129 +/- 17; t-test, p>0.05).

      References:

      1) Garcia-Falgueras, A., Pinos, H., Collado, P., Pasaro, E., Fernandez, R., Segovia, S., & Guillamon, A. (2005). The expression of brain sexual dimorphism in artificial selection of rat strains. Brain Research, 1052(2), 130–138. https://doi.org/10.1016/j.brainres.2005.05.066

      2) Guillamon, A., De Blas, M. R., & Segovia, S. (1988). Effects of sex steroids on the development of the locus coeruleus in the rat. Developmental Brain Research, 40, 306–310.

      Reviewer #3 (Recommendations For The Authors):

      MAJOR

      1) It is worth noting that responding in Group ChR2 decreased from Extinction 3 to Test 1, while responding in the other two groups appears to have remained the same. This suggests that there was no spontaneous recovery of responding in the controls; and, as such, something more must be said about the basis of the between-group differences in responding at test. This is particularly important as each extinction session involved eight presentations of the to-be-tested stimulus, whereas the test itself consisted of just three stimulus presentations. Hence, comparing the mean levels of performance to the stimulus across its extinction and testing overestimates the true magnitude of spontaneous recovery, which is simply not clear in the results of this study. That is, it is not clear that there is any spontaneous recovery at all and, therefore, that the basis of the difference between Group ChR2 and controls at test is in terms of spontaneous recovery.

      The reviewer is correct that there were a different number of trials in extinction vs. test sessions, making direct comparison difficult, and that displaying the data as averages of the test session does not demonstrate spontaneous recovery per se. To address this we have added analyses of trial data and comparison of the final three trials of extinction to the subsequent three trials of each test. These analyses are included on pages 5 and 6 of the manuscript, and additional data figures can be found as panels 2E, 2F, 4H and 4I, and pasted below.

      What we observe in the trial data for controls is an increase in responding from the end of extinction to the beginning of each test, thus demonstrating spontaneous recovery. Importantly, responding in the ChR2 group does not increase from the end of extinction to the beginning of the test, illustrating that LC stimulation during extinction prevents spontaneous recovery.

      Comparison of the final three trials of Extinction to the three trials of Test 1:

      Author response image 6.

      Comparison of the final three trials of Extinction to the three trials of Test 2:

      Author response image 7.

      Halorhodopsin Experiment Tests 1 and 2, respectively.

      Author response image 8.

      2a) Did the manipulations have any effect on the rates of lever-pressing outside of the stimulus?

      We did not detect any effect of the optogenetic manipulations on rates of lever pressing outside of the stimulus. This is demonstrated in the pre-CS intervals collected on stimulation days (i.e., extinction sessions), where we see similar response rates between controls and the ChR2 and Offset groups, as shown below. There was no effect of group [F(2,40)=0.156, p=0.856] or group x extinction day interaction [F(2,40)=0.146, p=0.865].

      Author response image 9.

      2b) Did the manipulations have any effect on rates of magazine entry either during or after the stimulus?

      For the ChR2 experiment, there were no group differences in magazine responses at the end of training [F(2,40)=2.442, p=0.100]. Responding decreased across days of extinction (when optogenetic stimulation was given) [F(2, 80)=38.070, p<0.001], but there was no effect of group [F(2,40)=0.801, p=0.456] and no interaction between day and group [F(4,40)=1.461, p=0.222]. Although a similar pattern is seen in the test data, group differences were not statistically different in the first [F(2,40)=2.352, p=0.108] or second [F(2,40)=1.900, p=0.166] tests, perhaps because magazine responses were quite low. Thus, overall, magazine data do not present a different picture than lever-pressing, but because of the lack of statistical effects during testing, we have chosen not to include these data in the manuscript.

      For the eNpHR experiment, again a similar pattern to lever-pressing was seen. There were no group differences at the end of acquisition [F(1,15)=0.290, p=0.598]. Responding decreased across days of extinction [F(2, 30)=4.775, p=0.016] but there was no main effect of group [F(1,15)=1.188, p=0.293], and no interaction between extinction and group [F(2,30)=0.070, p=0.932]. There were no group differences in the number of magazine entries in Test 1 [F(1,15)=1.378, p=0.259] or Test 2 [F(1,15)=0.319, p=0.580].

      Author response image 10.

      Author response image 11.

      2c) Did the manipulations affect the coupling of lever-press and magazine entry responses? I imagine that, after training, the lever-press and magazine entry responses are coupled: rats only visit the magazine after having made a lever-press response (or some number of lever-press responses). Stimulating the LC clearly had no acute effect on the performance of the lever-press response. If it also had no effect on the total number of magazine entries performed during the stimulus, it would be interesting to know whether the coupling of lever-presses and magazine entries had been disturbed in any way. One could assess this by looking at the joint distribution of lever-presses (or runs of lever-presses) and magazine visits in each extinction session, or across the three sessions of extinction. As a proxy for this, one could look at the average latency to enter the magazine following a lever-press response (or run of lever-presses). Any differences here between the Controls and Group ChR2 would be informative with respect to the effects of the LC manipulations: that is, the results shown in Figure indicate that stimulating the LC has no acute effects on lever-pressing but protects against something like spontaneous recovery; whereas the results shown in Figure 4 indicate that inhibiting the LC facilitates the loss of responding across extinction without protecting against spontaneous recovery. The additional data/analyses suggested here would indicate whether LC stimulation had any acute effects on responding that might explain the protection from spontaneous recovery; and whether LC inhibition specifically reduced lever-pressing across extinction or whether it had equivalent effects on rates of magazine entry.

      Lever-press and magazine response data were collected trial by trial but not with the temporal resolution required for the analyses suggested by the reviewer. We do not have timestamps for magazine entries nor latency data. We can collect this type of data in future studies. At the session or trial level, magazine entries generally correspond to lever-pressing; being trained on ratio schedules, and from informal observation, rats will do several lever-presses and then check the magazine. Rates of each decrease across extinction (magazine data included in response to comment 2b. above). Optogenetic manipulation appeared to have no immediate effect on either response during extinction.

      PROCEDURAL

      1) Why were there three discriminative stimuli in acquisition: a light, white noise, and clicker?

      This was done to be consistent with and apply parameters similar to previous, related studies (Rescorla, 2006; Janak & Corbit, 2011) and to allow comparison to potential future studies that may involve stimulus compounds etc. (requiring training of multiple stimuli).

      2) Why were some rats extinguished to the noise while others were extinguished to the clicker? Were the effects of LC stimulation/inhibition dependent on the identity of the extinguished stimulus?

      Because the animals were trained with multiple stimuli, it allowed us some ability to choose amongst those stimuli to best balance response rates across groups before the key manipulations. The effects of LC manipulation did not differ between animals based on the identity of the extinguished stimulus.

      3) Did the acute effects of LC inhibition on extinction vary as a function of the stimulus identity?

      No

      4) Was the ITI in extinction the same as that in acquisition?

      Yes, the ITI was the same for acquisition and extinction sessions (variable, averaging to 90 seconds). We have added a sentence to the methods (p. 11) to reflect this.

      5) For Group Offset, when was the photo-stimulation applied in relation to the extinguished stimulus: was it immediately upon offset of the stimulus or at a later point in the ITI?

      The group label “Offset” was used to be consistent with Uematsu et al. (2017), who delivered stimulation 50-70s after a trial. Similarly, we mean it as discontinuous with the stimulus, not at the termination of the stimulus. We have revised the description of this group on page 11 to clarify the timing of the photostimulation as follows:

      “Animals in the Offset group (and relevant controls) underwent identical training with the exception that stimulation in extinction sessions occurred in the middle of the variable length ITI (45s after stimulus termination, on average).”
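      As a small worked illustration of this timing, the sketch below computes a mid-ITI stimulation delay for each trial. It assumes the ITI is measured from stimulus termination to the next stimulus onset and that "middle" means half of each ITI; the function name and the ITI values are hypothetical and chosen only so that they average 90 seconds.

```python
def offset_stim_delays(itis_s):
    """Delay (in seconds after stimulus termination) at which
    stimulation would fall if delivered at the midpoint of each ITI."""
    return [iti / 2 for iti in itis_s]


# Hypothetical variable ITIs in seconds, averaging 90 s (NOT the study's schedule).
itis = [60, 90, 120]

delays = offset_stim_delays(itis)
mean_delay = sum(delays) / len(delays)
print(delays, mean_delay)
```

      Under these assumptions, a variable ITI averaging 90 seconds gives an average delay of 45 seconds, in line with the quoted description.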

      MINOR

      1) "Such recovery phenomena undermine the success of extinction-based therapies..."

      ***Perhaps a different phrasing is needed here: "These phenomena show that extinction-based therapies are not always effective in suppressing an already-established response..."

      We have revised this sentence in line with the reviewer’s suggestion:

      “These phenomena mean that extinction-based therapies are not always successful in suppressing previously-established behaviours” (first paragraph of the introduction).

      2) Typo in para 1 of results: "F(2,19)=0.0.352"

      Thank you for finding this typo. It has been corrected. (p.4)

      3) "As another example of modular functional organization, no improvements to strategy set-shifting following global LC stimulation, but improvements were observed when LC terminals in the medial prefrontal cortex were targeted (Cope et al., 2019)." ***This sentence is missing a "there were" before "no improvements".

      Thank you for finding this error. It has been corrected. (p.8)

    2. eLife assessment

      In this important study, Lui and colleagues examine whether the locus coeruleus is involved in extinction of an appetitive conditioned response. Using a set of optogenetic approaches aimed at manipulating the activity of locus coeruleus cells, the authors provide solid evidence that these neurons regulate the extinction of conditioned responses. Overall this study further highlights the key role of noradrenaline in cognitive processes and will be of interest to those interested in associative learning, extinction, noradrenaline, associated brain systems and translational endpoints.

    3. Reviewer #1 (Public Review):

      In this paper by Lui and colleagues, the authors examine the role of locus coeruleus (LC)-noradrenaline (NA) neurons in the extinction of appetitive instrumental conditioning. They report that optogenetic activation of global LC-NA neurons during the conditioned stimulus (CS) period of extinction enhances long-term extinction memory without affecting within-session extinction. In contrast, LC-NA activation during the intertrial interval doesn't affect extinction and long-term memory. They then show that optogenetic activation of LC-NA neurons doesn't induce conditioned place preference/avoidance. Finally, they assess the necessity of LC-NA neurons in appetitive extinction and find that optogenetic inactivation of LC-NA neurons during CS period results in enhancement of within-session extinction. The experiments are well-designed, including offset control in the optogenetic activation study. I think this study adds new insight into the LC-NA system in the context of appetitive extinction.

      Strength:

      ・These studies identify the artificial activation of LC-NA neurons enhances long-term memory of appetitive extinction while this activation can't induce long-term conditioned place aversion. Thus, optogenetic activation of LC-NA neurons can inhibit spontaneous recovery of appetitive extinction without causing long-term aversive memory.

      ・Optoinhibition study demonstrates the reduction of conditioned response of within-session extinction. Therefore, LC-NA neuronal activity at the CS period of extinction could act as anti-extinction or be important for the expression of conditioned response.

      Weakness:

      ・It is unclear from this study how LC-NA neurons behave during the CS period of appetitive extinction. This weakens the importance of the optogenetic inactivation result.

      ・While the authors manipulate global LC-NA neurons, many people find functionally heterogeneous populations in the LC. It remains unresolved whether there is a specific LC-NA subpopulation responsible for appetitive extinction.

    4. Reviewer #2 (Public Review):

      Understanding how the LC/noradrenaline system controls basic cognitive processes is important and timely. This study aims to understand the role of the Locus Coeruleus/noradrenaline system in extinction of conditioned responding. The authors used a discriminative appetitive procedure to show that photoexcitation of noradrenergic neurons of the Locus Coeruleus has no effect on performance during extinction but impacts expression of extinguished responding through decreased spontaneous recovery. This study is appropriately designed and the results are well analysed. Therefore, it provides an important and timely addition to the field.

    5. Reviewer #3 (Public Review):

      The introduction/background is excellent. It reviews evidence showing that extinction of conditioned responding is regulated by noradrenaline and suggesting that the locus coeruleus (LC) may be a critical locus of this regulation. This naturally leads to the aim of the study: to determine whether the locus coeruleus is involved in extinction of an appetitive conditioned response. Overall, the study is well designed, nicely conducted and the results advance our understanding of the role of the LC in extinction of conditioned behaviour. Future studies may provide more fine-grained analyses of behavioral data to clarify the impact of the LC manipulations (stimulation and inhibition) on performance in the task.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      The Roco proteins are a family of GTPases characterized by the conserved presence of an ROC-COR tandem domain. How GTP binding alters the structure and activity of Roco proteins remains unclear. In this study, Galicia C et al. took advantage of conformation-specific nanobodies to trap CtRoco, a bacterial Roco, in an active monomeric state and determined its high-resolution structure by cryo-EM. This study, in combination with the previous inactive dimeric CtRoco, revealed the molecular basis of CtRoco activation through GTP-binding and dimer-to-monomer transition.

      Strengths:

      The reviewer is impressed by the authors' deep understanding of the CtRoco protein. Capturing Roco proteins in a GTP-bound state is a major breakthrough in the mechanistic understanding of the activation mechanism of Roco proteins and shows similarity with the activation mechanism of LRRK2, a key molecule in Parkinson's disease. Furthermore, the methodology the authors used in this manuscript - using conformation-specific nanobodies to trap the active conformation, which is otherwise flexible and resistant to single-particle average - is highly valuable and inspiring.

      Weakness:

      Though written with good clarity, the paper will benefit from some clarifications.

      1) The angular distribution of particles for the 3D reconstructions should be provided (Figure 1 - Sup. 1 & Sup. 2).

      The supplementary figures will be adapted to include particle distribution plots.

      2) The B-factors for protein and ligand of the model, Map sharpening factor, and molprobity score should be provided (Table 1).

      The map used to interpret the model was post-processed by density modification, therefore no sharpening factor was obtained. This information will be included in Table 1, together with B-factors and molprobity scores.

      3) A supplemental Figure to Figure 2B, illustrating how a0-helix interacts with COR-A&LRR before and after GTP binding in atomic details, will be helpful for the readers to understand the critical role of a0-helix during CtRoco activation.

      A supplemental figure will be prepared to illustrate this in the revised document.

      4) For the following statement, "On the other hand, only relatively small changes are observed in the orientation of the Roc a3 helix. This helix, which was previously suggested to be an important element in the activation of LRRK2 (Kalogeropulou et al., 2022), is located at the interface of the Roc and CORB domains and harbors the residues H554 and Y558, orthologous to the LRRK2 PD mutation sites N1337 and R1441, respectively."

      It is not surprising the a3-helix of the ROC domain only has small changes when the ROC domain is aligned (Figure 2E). However, in the study by Zhu et al (DOI: 10.1126/science.adi9926), it was shown that a3-helix has a "see-saw" motion when the COR-B domain is aligned. Is this motion conserved in CtRoco from inactive to active state?

      We indeed describe the conformational changes from the perspective of the Roc domain. When using the COR-B domain for structural alignment, a rotational movement of Roc (including a “seesaw”-like movement of the α3-helix around His554) with respect to COR-B is correspondingly observed. We will include this in the revised document.

      5) A supplemental figure showing the positions of and distances between NbRoco1 K91 and Roc K443, K583, and K611 would help the following statement. "Also multiple crosslinks between the Nbs and CtRoco, as well as between both nanobodies were found. ... NbRoco1-K69 also forms crosslinks with two lysines within the Roc domain (K583 and K611), and NbRoco1-K91 is crosslinked to K583".

      A provisional figure displaying these crosslinks is already provided below, and we will also consider including this in the revised manuscript. However, in interpreting these crosslinks it should be taken into consideration that the additive length of the DSSO spacer and the lysine side chains leads to a theoretical upper limit of ∼26 Å for the distance between the α carbon atoms of cross-linked lysines (and even a cut-off distance of 35 Å when taking into account protein dynamics).

      Author response image 1.
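      The distance criterion mentioned above can be illustrated with a minimal sketch that classifies a Cα–Cα distance against the two thresholds quoted in the response (∼26 Å and 35 Å). The function name, labels and coordinates are hypothetical and are not taken from the CtRoco model; in practice the coordinates would come from the lysine Cα atoms of the deposited structure.

```python
import math

# Thresholds quoted in the author response (in angstroms).
DSSO_CA_LIMIT = 26.0   # theoretical upper limit for DSSO-linked lysine C-alpha atoms
DYNAMIC_CUTOFF = 35.0  # looser cut-off allowing for protein dynamics


def classify_crosslink(ca_a, ca_b):
    """Return the C-alpha to C-alpha distance and a label describing
    how it compares with the two distance thresholds."""
    distance = math.dist(ca_a, ca_b)
    if distance <= DSSO_CA_LIMIT:
        label = "within theoretical limit"
    elif distance <= DYNAMIC_CUTOFF:
        label = "plausible with protein dynamics"
    else:
        label = "exceeds cut-off"
    return distance, label


# Hypothetical C-alpha coordinates (NOT taken from the structure).
example = classify_crosslink((0.0, 0.0, 0.0), (0.0, 0.0, 30.0))
print(example)
```

      A crosslink falling in the intermediate band is not necessarily inconsistent with the model, which is why both thresholds are kept separate here.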

      6) It would be informative to show the position of CtRoco-L487 in the NF and GTP-bound state and comment on why this mutation favors GTP hydrolysis.

      We will create an additional figure showing the position of L487, and discuss possible mechanisms for the observed effect of a mutation on GTPase activity.

      Reviewer #2 (Public Review):

      Summary

      The manuscript by Galicia et al describes the structure of the bacterial GTPyS-bound CtRoco protein in the presence of nanobodies. The major relevance of this study is in the fact that the CtRoco protein is a homolog of the human LRRK2 protein with mutations that are associated with Parkinson's disease. The structure and activation mechanisms of these proteins are very complex and not well understood. Especially lacking is a structure of the protein in the GTP-bound state. Previously the authors have shown that two conformational nanobodies can be used to bring/stabilize the protein in a monomer-GTPyS-bound state. In this manuscript, the authors use these nanobodies to obtain the GTPyS-bound structure and importantly discuss their results in the context of the mammalian LRRK2 activation mechanism and mutations leading to Parkinson's disease. The work is well performed and clearly described. In general, the conclusions on the structure are reasonable and well-discussed in the context of the LRRK2 activation mechanism.

      Strengths:

      The strong points are the innovative use of nanobodies to stabilize the otherwise flexible protein and the new GTPyS-bound structure that helps enormously in understanding the activation cycle of these proteins.

      Weakness:

      The strong point of the use of nanobodies is also a potential weak point; these nanobodies may have induced some conformational changes in a part of the protein that will not be present in a GTPyS-bound protein in the absence of nanobodies.

      Two major points need further attention.

      1) Several parts of the protein are very flexible during the monomer-dimer activity cycle. This flexibility is crucial for protein function, but obviously hampers structure resolution. Forced experiments to reduce flexibility may allow better structure resolution, but at the same time may impede the activation cycle. Therefore, careful experiments and interpretation are very critical for this type of work. This especially relates to the influence of the nanobodies on the structure that may not occur during the "normal" monomer-dimer activation cycle in the absence of the nanobodies (see also point 2). So what is the evidence that the nanobody-bound GTPyS-bound state is biochemically a reliable representative of the "normal" GTP-bound state in the absence of nanobodies, and that the obtained structure can therefore be confidently used to interpret the activation mechanism as done in the manuscript?

      See below for an answer to remarks 1 and 2.

      2) The obtained structure with two nanobodies reveals that the nanobodies NbRoco1 and NbRoco2 bind to parts of the protein by which a dimer is impossible, respectively to a0-helix of the linker between Roc-COR and LRR, and to the cavity of the LRR that in the dimer binds to the dimerizing domain CORB. It is likely the open monomer GTP-bound structure is recognized by the nanobodies in the camelid, suggesting that overall the open monomer structure is a true GTP-bound state. However, it is also likely that the binding energy of the nanobody is used to stabilize the monomer structure. It is not automatically obvious that in the details the obtained nanobody-Roco-GTPyS structure will be identical to the "normal" Roco-GTPyS structure. What is the influence of nanobody-binding on the conformation of the domains where they bind; the binding energy may be used to stabilize a conformation that is not present in the absence of the nanobody. For instance, NbRoco1 binds to the a0 helix of the linker; what is here the "normal" active state of the Roco protein, and is e.g. the angle between RocCOR and LRR also rotated by 135 degrees? Furthermore, nanobody NbRoco2 in the LRR domain is expected to stabilize the LRR domain; it may allow a position of the LRR domain relative to the rest of the protein that is not present without nanobody in the LRR domain. I am convinced that the observed open structure is a correct representation of the active state, but many important details have to be supported by e.g. their CX-MS experiments, and in the end probably need confirmation by more structures of other active Roco proteins or confirmation by a more dynamic sampling of the active states by e.g. molecular dynamics or NMR.

      Recently, nanobodies have increasingly been used successfully to obtain structural insights into protein conformational states (reviewed in Uchański et al, Curr. Opin. Struc. Biol. 2020). As Reviewer #2 points out, the concern is sometimes raised that antibodies could distort a protein into non-native conformations. Here, it is important to note that the nanobodies were raised by immunizing a llama with the fully native CtRoco protein bound to a non-hydrolysable GTP analogue, after which the nanobodies were selected by phage display using the same fully native and functional form of the protein. As clearly explained in Manglik et al. Annu Rev Pharmacol Toxicol. 2017, the probability of an in vivo matured nanobody inducing a non-native conformation of the antigen is low, although it is possible that it selects a high-energy, low-population conformation of a dynamic protein. Immature B cells require engagement of displayed antibodies with antigen to proliferate and differentiate during clonal selection. Antibodies that induce non-native conformations of the antigen pay a substantial energetic penalty in this process, and B cell clones displaying such antibodies will have a significantly lower probability of proliferation and differentiation into mature antibody-secreting B lymphocytes. Hence, many recent experiments and observations give credence to the notion that nanobodies bind antigens primarily by conformational selection and not induced fit (e.g. Smirnova et al. PNAS 2015).

      Extrapolated to the case of CtRoco, which is clearly very flexible in its GTP-bound form, this means that the nanobodies are able to trap and stabilize one conformational state that is representative of the “active state” ensemble of the protein. In this respect, it is clear from our experiments (XL-MS, affinity and effect on GTPase activity) that the effects of NbRoco1 and NbRoco2 are additive (or even cooperative), meaning that both nanobodies recognize different features of the same CtRoco “active state”. Correspondingly, the monomeric, elongated “open” conformation is also observed in the structure of CtRoco bound to NbRoco1 only (Figure 1 - supplement 2), although this structure still displays more flexibility. The monomerization and conformational changes that we observe and describe in the current paper at high resolution are also in very good agreement with earlier observations for CtRoco in the GTP-bound form in the absence of any nanobodies, including negative stain EM (Deyaert et al. Nature Commun, 2017), hydrogen-deuterium exchange experiments (Deyaert et al. Biochem. J. 2019) and native MS (Leemans et al. Biochem J. 2020).

      In the revised document we will include some additional text to address and clarify these aspects.

    2. eLife assessment

      ROCO proteins are evolutionarily conserved GTPases characterized by the presence of a tandem "COR" domain, sometimes accompanied by a kinase domain as in the LRRK2 protein that is linked to neurodegeneration. Previously the authors have shown that two conformational nanobodies can be used to trap a ROCO protein CtRoco in a monomer-GTPyS-bound state. The high-resolution structural data here provides convincing insights into the active state conformation of CtRoco, an important initial advance towards a broader mechanistic understanding of these cryptic tandem-domain proteins.

    3. Reviewer #1 (Public Review):

      Summary:

      The Roco proteins are a family of GTPases characterized by the conserved presence of an ROC-COR tandem domain. How GTP binding alters the structure and activity of Roco proteins remains unclear. In this study, Galicia C et al. took advantage of conformation-specific nanobodies to trap CtRoco, a bacterial Roco, in an active monomeric state and determined its high-resolution structure by cryo-EM. This study, in combination with the previous inactive dimeric CtRoco, revealed the molecular basis of CtRoco activation through GTP-binding and dimer-to-monomer transition.

      Strengths:

      The reviewer is impressed by the authors' deep understanding of the CtRoco protein. Capturing Roco proteins in a GTP-bound state is a major breakthrough in the mechanistic understanding of the activation mechanism of Roco proteins and shows similarity with the activation mechanism of LRRK2, a key molecule in Parkinson's disease. Furthermore, the methodology the authors used in this manuscript - using conformation-specific nanobodies to trap the active conformation, which is otherwise flexible and resistant to single-particle average - is highly valuable and inspiring.

      Weakness:

      Though written with good clarity, the paper will benefit from some clarifications.

      1. The angular distribution of particles for the 3D reconstructions should be provided (Figure 1 - Sup. 1 & Sup. 2).

      2. The B-factors for protein and ligand of the model, Map sharpening factor, and molprobity score should be provided (Table 1).

      3. A supplemental Figure to Figure 2B, illustrating how a0-helix interacts with COR-A&LRR before and after GTP binding in atomic details, will be helpful for the readers to understand the critical role of a0-helix during CtRoco activation.

      4. For the following statement, "On the other hand, only relatively small changes are observed in the orientation of the Roc a3 helix. This helix, which was previously suggested to be an important element in the activation of LRRK2 (Kalogeropulou et al., 2022), is located at the interface of the Roc and CORB domains and harbors the residues H554 and Y558, orthologous to the LRRK2 PD mutation sites N1337 and R1441, respectively."

      It is not surprising the a3-helix of the ROC domain only has small changes when the ROC domain is aligned (Figure 2E). However, in the study by Zhu et al (DOI: 10.1126/science.adi9926), it was shown that a3-helix has a "see-saw" motion when the COR-B domain is aligned. Is this motion conserved in CtRoco from inactive to active state?

      5. A supplemental figure showing the positions of and distances between NbRoco1 K91 and Roc K443, K583, and K611 would help the following statement. "Also multiple crosslinks between the Nbs and CtRoco, as well as between both nanobodies were found. ... NbRoco1-K69 also forms crosslinks with two lysines within the Roc domain (K583 and K611), and NbRoco1-K91 is crosslinked to K583".

      6. It would be informative to show the position of CtRoco-L487 in the NF and GTP-bound state and comment on why this mutation favors GTP hydrolysis.

    4. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Galicia et al describes the structure of the bacterial GTPyS-bound CtRoco protein in the presence of nanobodies. The major relevance of this study is in the fact that the CtRoco protein is a homolog of the human LRRK2 protein with mutations that are associated with Parkinson's disease. The structure and activation mechanisms of these proteins are very complex and not well understood. Especially lacking is a structure of the protein in the GTP-bound state. Previously the authors have shown that two conformational nanobodies can be used to bring/stabilize the protein in a monomer-GTPyS-bound state. In this manuscript, the authors use these nanobodies to obtain the GTPyS-bound structure and importantly discuss their results in the context of the mammalian LRRK2 activation mechanism and mutations leading to Parkinson's disease. The work is well performed and clearly described. In general, the conclusions on the structure are reasonable and well-discussed in the context of the LRRK2 activation mechanism.

      Strengths:

      The strong points are the innovative use of nanobodies to stabilize the otherwise flexible protein and the new GTPyS-bound structure that helps enormously in understanding the activation cycle of these proteins.

      Weakness:

      The strong point of the use of nanobodies is also a potential weak point; these nanobodies may have induced some conformational changes in a part of the protein that will not be present in a GTPyS-bound protein in the absence of nanobodies.

      Two major points need further attention.

      1. Several parts of the protein are very flexible during the monomer-dimer activity cycle. This flexibility is crucial for protein function, but obviously hampers structure resolution. Forced experiments to reduce flexibility may allow better structure resolution, but at the same time may impede the activation cycle. Therefore, careful experiments and interpretation are critical for this type of work. This especially relates to the influence of the nanobodies on the structure that may not occur during the "normal" monomer-dimer activation cycle in the absence of the nanobodies (see also point 2). So what is the evidence that the nanobody-bound GTPyS-bound state is biochemically a reliable representative of the "normal" GTP-bound state in the absence of nanobodies, and that the obtained structure can therefore be confidently used to interpret the activation mechanism as done in the manuscript?

      2. The obtained structure with two nanobodies reveals that the nanobodies NbRoco1 and NbRoco2 bind to parts of the protein by which a dimer is impossible, respectively to a0-helix of the linker between Roc-COR and LRR, and to the cavity of the LRR that in the dimer binds to the dimerizing domain CORB. It is likely the open monomer GTP-bound structure is recognized by the nanobodies in the camelid, suggesting that overall the open monomer structure is a true GTP-bound state. However, it is also likely that the binding energy of the nanobody is used to stabilize the monomer structure. It is not automatically obvious that in the details the obtained nanobody-Roco-GTPyS structure will be identical to the "normal" Roco-GTPyS structure. What is the influence of nanobody-binding on the conformation of the domains where they bind; the binding energy may be used to stabilize a conformation that is not present in the absence of the nanobody. For instance, NbRoco1 binds to the a0 helix of the linker; what is here the "normal" active state of the Roco protein, and is e.g. the angle between RocCOR and LRR also rotated by 135 degrees? Furthermore, nanobody NbRoco2 in the LRR domain is expected to stabilize the LRR domain; it may allow a position of the LRR domain relative to the rest of the protein that is not present without nanobody in the LRR domain. I am convinced that the observed open structure is a correct representation of the active state, but many important details have to be supported by e.g. their CX-MS experiments, and in the end probably need confirmation by more structures of other active Roco proteins or confirmation by a more dynamic sampling of the active states by e.g. molecular dynamics or NMR.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS. That said, there are some points where I feel the authors could have taken their care a bit further and, as a result, inform the community even more about what is in their data.

      We thank the reviewer for this globally positive evaluation of our work, and we appreciate the advice to improve our manuscript.

      The introduction sets up the contrast of 'ecological' (mostly foraging) and social variables of a primate's life that can be reflected in the relative size of brain regions. This debate is for a large part a relic of the literature and the authors themselves state in a number of places that perhaps the contrast is a bit artificial. I feel that they could go further in this. Social behavior could easily be a solution to foraging problems, making them variables that are not in competition, but simply different levels of explanation. This point has been made in some of the recent work by Robin Dunbar and Susanne Shultz.

      Thank you for this constructive comment, and we acknowledge that the contrast between social vs ecological brain is relatively marginal here. Based also on the first remark by reviewer 3, we have reformulated the introduction to emphasize what we think is actually more critical: the link between cognitive functions as defined in laboratory conditions and socio-ecological variables measured in natural conditions. And the fact that here, we use brain measures as a potential tool to relate these laboratory vs natural variables through a common scenario. Also, we were already mentioning the potential interaction between social and foraging processes in the discussion, but we are happy to add a reference to recent studies by S. Shultz and R. Dunbar (2022), which is indeed directly relevant. We thank the reviewer for pointing out this literature.

      In a similar vein, the hypotheses of relating frontal pole to 'meta-cognition' and dorsolateral PFC to 'working memory' is a dramatic oversimplification of the complexity of cognitive function and does a disservice to the careful approach of the rest of the manuscript.

      We agree that the formulation of which functions we were attributing to the distinct brain regions might not have been clear enough, but the functional relations between frontal pole and metacognition on the one hand, and DLPFC and working memory on the other hand, have been firmly established in the literature, both through laboratory studies and through clinical data. Clearly, no single brain region is necessary and sufficient for any cognitive operation, but decades of neuropsychology have demonstrated the differential implication of distinct brain regions in distinct functions, which is all we mean here. We have made a specific point on that topic in the discussion (cf p. 16). We have also reformulated the introduction to clarify that, even if the relation between these regions and their functions (FP/ metacognition; DLPFC/ working memory) was clear in laboratory conditions, it was not clear whether this mapping could be used for real life conditions. Whether that simplification was somehow justified beyond the lab (and the clinic), and whether these neuro-cognitive concepts could be applied to natural conditions, are therefore critical questions that we wanted to address. The central goal of the present study was precisely to evaluate the extent to which this brain/cognition relation could be used to understand more natural behaviors and functions, and we hope that it appears more clearly now.

      One can also question the predicted relationship between frontal pole meta-cognition and social abilities versus foraging, as Passingham and Wise show in their 2012 book that it is frontal pole size that correlates with learning ability-an argument that they used to relate this part of the brain to foraging abilities. I would strongly suggest the authors refrain from using such descriptive terms. Why not simply use the names of the variables actually showing significant correlations with relative size of the areas?

      We basically agree with the reviewer, and we acknowledge the lack of clarity in the introduction of the previous manuscript. There was indeed a lot of ambiguity in what we were referring to as ‘function’, associated with a given brain region. ‘Function’ referred to way too many things! We have reformulated the introduction not only to clarify the different types of functions that were attributed to distinct brain regions in the literature but also to clarify how this study was addressing the question: by trying to articulate concepts from neuroscience laboratory studies with concepts from behavioral ecology and evolution using intuitive scenarios. We hope that the present version of the introduction makes that point clearer.

      The major methodological judgements in this paper are of course in the delineation of the frontal pole and dorsolateral prefrontal cortex. As I said above, I appreciate how carefully the authors describe their anatomical procedure, allowing researchers to replicate and extend their work. They are also careful not to relate their regions of interest to precise cytoarchitectonic areas, as such a claim would be impossible to make without more evidence. That said, there is a judgement call made in using the principal sulcus as a boundary defining landmark for FP in monkeys and the superior frontal sulcus in apes. I do not believe that these sulci are homologous. Indeed, the authors themselves go on to argue that dorsolateral prefrontal cortex, where studied using cytoarchitecture, stretches to the fundus of principal sulcus in monkeys, but all the way to the inferior frontal sulcus in apes. That means that using the fundus of PS is not a good landmark.

      We thank the reviewer for his kind remarks on our careful descriptions. But then, it is not clear whether our choice of using the principal sulcus as a boundary for FP in monkeys vs the superior frontal sulcus in apes is actually a judgement call. First and foremost, there is no clear and unambiguous definition of what the boundaries of the FP should be, short of cytoarchitectonic maps, which are clearly out of reach here. In humans and great apes we used Bludau et al 2014 (i.e. sup frontal sulcus), and in monkeys, we chose a conservative landmark that eliminated area 9, which is traditionally associated with the DLPFC (Petrides, 2005; Petrides et al, 2012; Semendeferi et al, 2001).

      Of course, any definition will attract criticism, so the best solution might be to run the analysis multiple times, using different definitions for the areas, and see how this affects results.

      Indeed, functional maps indicate that the dorsal part of anterior PFC in monkeys is functionally part of FP. But again, cytoarchitectonic maps also indicate that this part of the brain includes BA 9, which is traditionally associated with DLPFC (Petrides, 2005; Petrides et al, 2012). As already pointed out in the discussion, there is a functional continuum between FP and DLPFC and our goal when using PS as dorsal border was to be very conservative and to exclude the ambiguous area. But we agree with the reviewer that given that this decision is arbitrary, it was worth exploring other definitions of the FP volume. So, we did complete a new analysis with a less conservative definition of the FP, to include this ambiguous dorsal area, and it is now included in the supplementary material. Maybe as expected, including the ambiguous area in the FP volume shifted the relation with socio-ecological variables towards the pattern displayed by the DLPFC (i.e. the influence of population density decreased). The most parsimonious interpretation of this result is that when extending the border of the FP region to cover a part of the brain which might belong to the DLPFC, or which might be somehow functionally intermediate between the two, the specific relation of the FP with socio-ecological variables decreases. Thus, even if we agree that it was important to conduct this analysis, we believe that it only confirms the difficulty of identifying a clear boundary between FP and DLPFC. Again, we have clearly explained throughout the manuscript that we admit the lack of precision in our definitions of the functional brain regions. In that frame, the conservative option seems more appropriate and for the sake of clarity, the results of the additional analysis of an FP volume that includes the ambiguous area are only included in the supplementary material.

      If I understand correctly, the PGLS was run separately for the three brain measures (whole brain, FP, DLPFC). However, given that the measures are so highly correlated, is there an argument for an analysis that allows testing on residuals? In other words, to test effects of relative size of FP and DLPFC over and above brain size?

      Generally, using residuals as “data” (or pseudo-data) is not recommended in statistical analyses. Two widely cited references from the ecological literature are:

      Garcia-Berthou E. (2001) On the Misuse of Residuals in Ecology: Testing Regression Residuals vs. the Analysis of Covariance. Journal of Animal Ecology, 70 (4): 708-711.

      Freckleton RP. (2002). On the misuse of residuals in ecology: regression of residuals vs. multiple regression. Journal of Animal Ecology 71: 542–545. https://doi.org/10.1046/j.1365-2656.2002.00618.x.

      The main reason for this recommendation is that residuals are dependent on the fitted model, and thus on the particular sample under consideration and the eventual significant effects that can be inferred.
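      One commonly cited consequence of analysing residuals as data can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses ordinary least squares rather than PGLS, simulated data rather than the study's measurements, and hypothetical names (`x1` standing in for a size covariate, `x2` for a correlated predictor of interest). When the two predictors are correlated, regressing the residuals of `y` on `x1` against `x2` shrinks the slope by a factor of (1 - r²) relative to the multiple-regression estimate, where r is the sample correlation between `x1` and `x2`.

```python
import random

def centered(v):
    """Return v with its mean subtracted."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def simple_slope(x, y):
    """OLS slope of y on x (with intercept)."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)

def multiple_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 jointly (with intercept)."""
    a, b, c = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def residuals(x, y):
    """Residuals of the OLS regression of y on x (with intercept)."""
    slope = simple_slope(x, y)
    return [yi - slope * xi for xi, yi in zip(centered(x), centered(y))]

def compare(n=500, b1=1.0, b2=0.5, seed=0):
    """Simulate correlated predictors and compare the two estimates of b2.

    All parameter values are arbitrary choices for illustration.
    """
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x1]  # correlated with x1
    y = [b1 * a + b2 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]
    _, b2_multiple = multiple_slopes(x1, x2, y)
    b2_residual = simple_slope(x2, residuals(x1, y))
    a, b = centered(x1), centered(x2)
    r2 = dot(a, b) ** 2 / (dot(a, a) * dot(b, b))
    return b2_multiple, b2_residual, r2

if __name__ == "__main__":
    b2_multiple, b2_residual, r2 = compare()
    print("multiple regression slope:", b2_multiple)
    print("residual regression slope:", b2_residual)
    print("expected shrinkage factor (1 - r^2):", 1 - r2)
```

      In this toy setting the residual-based slope is always smaller in magnitude than the multiple-regression slope, which is one reason to include all predictors in a single model rather than test on residuals.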

      In the discussion and introduction, the authors discuss how size of the area is a proxy for number of neurons. However, as shown by Herculano-Houzel, this assumption does not hold across species. Across monkeys and apes, for instance, there is a difference in how many neurons can be packed per volume of brain. There is even earlier work from Semendeferi showing how frontal pole especially shows distinct neuron-to-volume ratios.

      We appreciate the reviewer’s comment, but the references to Herculano-Houzel that we have in mind do indicate that the assumption is legitimate within primates.

      Herculano-Houzel et al (2007) show that the neuronal density of the cortex is well conserved across primate species (but only monkeys were studied). The conclusion of that study is that using volumes as a proxy for number of neurons, as a measure of computational capacity, should be avoided between rodents and primates (and as they showed later, even more so with birds, for which neuronal density is higher). BUT within primates, since neuronal densities are conserved, volume is a good predictor of number of neurons. Gabi et al (2016) provide evidence that the neuronal density of the PFC is well conserved between humans and non-human primates, which implies that including humans and great apes in the comparison is legitimate. In addition, the brain regions included in the analysis presumably include very similar architectonic regions (e.g. BA 10 for FP, BA 9/46 for DLPFC), which also suggests that the neuronal density should be relatively well conserved across species. Altogether, we believe that there is sufficient evidence to support the idea that the volume of a PFC region in primates is a good proxy for the number of neurons in that region, and therefore of its computational capacity.

      Semendeferi and colleagues (2001) pointed out some differences in cytoarchitectonic properties across parts of the FP and discussed how these properties could 1) be used to identify area 10 across species 2) be associated with distinct computational properties, with the idea that thicker ‘cell body free’ layers would leave more space for establishing connections (across dendrites and axons). This pioneering work, together with more recent imaging studies on functional connectivity (e.g. Sallet et al, 2013) emphasize the critical contribution of connectivity pattern as a tool for comparative anatomy. But unfortunately, as pointed out in the discussion already, this is currently out of reach for us.

      We acknowledge the limitations, and to be fair, the notion of computational capacity itself is hard to define operationally. Based on the work of Herculano-Houzel et al, average density is conserved enough across primates (including humans) to justify our approximation. We have tried to define our regions of interest using both anatomical and functional maps and, thanks to the reviewer’s suggestions, we even tried several ways to segment these regions. Functional maps in macaques and humans do not exactly match cytoarchitectonic maps, presumably because functions rely not only upon the cytoarchitectonics but also on connectivity patterns (e.g. Sallet et al, 2013).

      In sum, we appreciate the reviewer’s point but feel that, given the current understanding of brain functions and the relative conservation of neuronal density across primate PFC regions, the volume of a PFC region seems to be a reasonable proxy for its number of neurons, and therefore its computational capacity. We have added these points to the discussion, and we hope that the reader will be able to get a fair sense of how legitimate that position is, given the literature.

      Overall, I think this is a very valuable approach and the study demonstrates what can now be achieved in evolutionary neuroscience. I do believe that the authors can be even more thorough and precise in their measurements and claims.

      Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum national d'Histoire naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience.

      We must not have been clear enough in our manuscript, because our goal is precisely not to separate humans from other primates. This is why, in contrast to other studies, we have included human and non-human primates in the same models. If our goal had been to study human evolution, we would have included fossil data (endocasts) from the human lineage.

      But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

      We admit that lissencephalic species could not be included in this study because we use sulci as key landmarks. We believe that including lissencephalic primates would have introduced a bias and noise in our comparisons, as the delimitations and landmarks would have been different for gyrencephalic and lissencephalic primates. Concerning development, it is simply beyond the scope of our study.

      Major comments.

      1) Is the brain modular? Is there modularity in brain evolution?: The entire manuscript is organized around the idea that the brain is a mosaic of units that have separate evolutionary trajectories:

      "In terms of evolution, the functional heterogeneity of distinct brain regions is captured by the notion of 'mosaic brain', where distinct brain regions could show a specific relation with various socio-ecological challenges, and therefore have relatively separate evolutionary trajectories".

      This hypothesis is problematic for several reasons. One of them is that each evolutionary module of the brain mosaic should originate in embryological development from a defined progenitor (or progenitors) domain [see García-Calero and Puelles (2020)]. Also, each evolutionary module should comprise connections with other modules; in the present case, FP and DLPFC have not evolved alone but in concert with, at least, their corresponding thalamic nuclei and striatal sector. Did those nuclei and sectors also expand across the selected primate species? Can the authors relate FP and DLPFC expansion to a shared progenitor domain across the analyzed species? This would be key to proposing homology hypotheses for FP and DLPFC across the selected species. The authors use the comparative approach throughout but never explicitly state their criteria for defining homology of the cerebral cortex sectors analyzed.

      We do not understand what the referee is referring to with the word ‘module’, and why it relates to development. Same thing for the anatomical relation with subcortical structures. Yes, the identity of distinct functional cortical regions relies upon subcortical inputs during development, but clearly this is neither technically feasible, nor relevant here anyways.

      We acknowledge, however, that our definition of functional regions was not precise enough, and we have updated the introduction to clarify that point. In short, we clearly do not want to make a strong case for the functional borders that we chose for the regions of interest here (FP and DLPFC), but rather use those regions as proxies for their corresponding functions as defined in laboratory conditions for a couple of species (rhesus macaques and humans, essentially).

      Contemporary developmental biology has shown that the selection of morphological brain features happens within severe developmental constraints. Thus, the authors need a hypothesis linking the evolutionary expansion of FP and DLPFC to their development. Otherwise, the claims from the mosaic brain and modularity lack fundamental support.

      Once again, we do not think that our definition of modules matches what the reviewer has in mind, i.e. modules defined by populations of neurons that developed together (e.g. visual thalamic neurons innervating visual cortices, themselves innervating visual thalamic neurons). Rather, the notion of mosaic brain refers to the fact that different parts of the brain are susceptible to distinct (but not necessarily exclusive) sources of selective pressures. The extent to which these ‘developmental’ modules are related to ‘evolutionary’ modules is clearly beyond the scope of this paper.

      Our goal here was to evaluate the extent to which modules that were defined based on cognitive operations identified in laboratory conditions could be related (across species) to socio-ecological factors as measured in wild animals. Again, we agree that the way these modules/functional maps were defined in the paper was confusing, and we hope that the new version of the manuscript makes this point clearer.

      Also, the authors refer most of the time to brain regions, which is confusing because they are analyzing cerebral cortex regions.

      We do not understand why the term ‘brain’ is more confusing than ‘cerebral cortex’, especially for a wide audience.

      2) Definition and delimitation of FP and DLPFC: The preceding questions are also related to the definition and parcellation of FP and DLPFC. How are homologous cortical sectors defined across primate species? And then, how are those sectors parcellated?

      The authors delimited the FP:

      "...according to different criteria: it should match the functional anatomy for known species (macaques and humans, essentially) and be reliable enough to be applied to other species using macroscopic neuroanatomical landmarks".

      There is an implicit homology criterion here: two cortical regions in two primate species are homologs if these regions have similar functional anatomy based on cortico-cortical connections. Also, macroscopic neuroanatomical landmarks serve to limit the homologs across species.

      This is highly problematic. First, because similar function means analogy and not necessarily homology [for further explanation see Puelles et al. (2019); García-Cabezas et al. (2022)].

      We are not sure we follow the Reviewer’s point here. First, it is not clear what would be the evolutionary scenario implied by this comment (evolutionary divergence followed by reversion leading to convergence?). Second, based on the literature, both the DLPFC and the FP display strong similarities between macaques and humans, in terms of connectivity patterns (Sallet et al, 2013), in terms of lesion-induced deficits and in terms of task-related activity (Mansouri et al, 2017). These criteria are usually sufficient to call two regions functionally equivalent. We do not see how this explanation is "highly problematic" as it is clearly the most parsimonious based on our current knowledge.

      Second, because there are several lissencephalic primate species; in these primates, like marmosets and squirrel monkeys, the whole approach of the authors could not have been implemented. Should we suppose that lissencephalic primates lack FP or DLPFC?

      We understand neither the reviewer’s logic, nor the tone. We understand that the reviewer is concerned by the debate on whether some laboratory species are more relevant than others for studying the human prefrontal cortex, but this is clearly not the objective of our work. As explained in the manuscript, we identified FP and DLPFC based on functional maps in humans and laboratory monkeys (macaques), and we used specific gyri as landmarks that could be reliably used in other species. And, as rightfully pointed out by reviewer 1, this is in and of itself not so trivial. Of course, lissencephalic animals could not be studied because we could not find these landmarks, but why would it mean that they do not have a prefrontal cortex? The reviewer implies that species that we did not study do not have a prefrontal cortex, which makes little sense. Standards in the field of comparative anatomy of the PFC, especially when it involves rodents (also lissencephalic), include cytoarchitectonic and connectivity criteria, but obviously we are not in a position to address it here. We have, however, included references to the seminal work of Angela Roberts and collaborators in the discussion on marmoset prefrontal functions, to reinforce the idea that the functional organization is relatively well conserved across all primates (with or without gyri on their brain) (Dias et al, 1996; Roberts et al, 2007).

      Do these primates have significantly more simplistic ways of life than gyrencephalic primates? Marmosets and squirrel monkeys have quite small brains; does it imply that they have not experienced the influence of socio-ecological factors on the size of FP, DLPFC, and the rest of the brain?

      Again, none of this is relevant here, because we could not draw conclusions on species that we cannot study for methodological reasons. The reviewer seems to believe that an absence of evidence is equivalent to evidence of absence, but we do not.

      The authors state that:

      "the strong development of executive functions in species with larger prefrontal cortices is related to an absolute increase in number of neurons, rather than in an increase in the ratio between the number of neurons in the PFC vs the rest of the brain".

      How does it apply to marmosets and squirrel monkeys?

      Again, we do not understand the reviewer’s point, since it is widely admitted that lissencephalic monkeys display both a prefrontal cortex and executive functions (again, see the work of Angela Roberts cited above). Our goal here was certainly not to get into the debate of what is the prefrontal cortex in a handful of laboratory species, but to evaluate the relevance of laboratory based neuro-cognitive concepts for understanding primates in general, and in their natural environment.

      References:

      García-Cabezas MA, Hacker JL, Zikopoulos B (2022) Homology of neocortical areas in rats and primates based on cortical type analysis: an update of the Hypothesis on the Dual Origin of the Neocortex. Brain Struct Funct, online ahead of print. doi:10.1007/s00429-022-02548-0

      García-Calero E, Puelles L (2020) Histogenetic radial models as aids to understanding complex brain structures: The amygdalar radial model as a recent example. Front Neuroanat 14:590011. doi:10.3389/fnana.2020.590011

      Nieuwenhuys R, Puelles L (2016) Towards a New Neuromorphology. doi:10.1007/978-3-319-25693-1

      Puelles L, Alonso A, Garcia-Calero E, Martinez-de-la-Torre M (2019) Concentric ring topology of mammalian cortical sectors and relevance for patterning studies. J Comp Neurol 527 (10):1731-1752. doi:10.1002/cne.24650

      Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis). My comments are organized by section below:

      We thank the reviewer for the globally positive evaluation and for the constructive remarks.

      Introduction:

      • Well written and thorough, but the questions presented could use restructuring.

      Again, we thank the reviewer, and we believe that this is coherent with some of the remarks of reviewer 1. We have extensively revised the introduction, toning down the social vs ecological brain issue to focus more on what is the objective of the work (evaluating the relevance of lab based neuro-cognitive concepts for understanding natural behavior in primates).

      Methods:

      • It is unclear which combinations of models were compared or why only population density and distance travelled appear to have been included.

      The details of the model comparison analysis were presented as a table in the supplementary material (#3, details of the model comparison data), but we understand that this was not clear enough. We have provided more explanation both in the main manuscript and in the supplements. All variables were considered a priori; however, we first conducted exploratory analyses, which led us to exclude some variables because of their lack of resolution (not enough categories for qualitative variables) or strong cross-correlations with other quantitative variables. Many more than three variables were included in the models, but the combination of these three (body mass, daily traveled distance and population density) best predicted the size of the brain regions, i.e. it had the smallest AIC. We provide additional information about these exploratory analyses in the supplementary material, sections 2 and 3.
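
      As an illustration of the AIC-based selection described above, here is a minimal sketch in Python. It assumes an ordinary least-squares fit with Gaussian errors and ignores the phylogenetic covariance that PGLS accounts for, so it is not the authors' procedure. The function names, residual sums of squares and parameter counts are hypothetical and chosen only for the example.

```python
import math


def gaussian_aic(rss, n_obs, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def rank_models(candidates, n_obs):
    """Return (name, AIC) pairs sorted from lowest (preferred) to highest AIC."""
    scored = [
        (name, gaussian_aic(rss, n_obs, k))
        for name, (rss, k) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical (residual sum of squares, number of parameters) per model.
# These numbers are invented for illustration and are not from the study.
toy_candidates = {
    "body_mass": (4.0, 2),
    "body_mass + distance": (2.5, 3),
    "body_mass + distance + density": (1.6, 4),
}

# 16 observations, matching the number of species mentioned in the reviews.
ranking = rank_models(toy_candidates, n_obs=16)
for name, aic in ranking:
    print(f"{name}: AIC = {aic:.2f}")
```

      The penalty term 2k means that an extra predictor is retained only if it lowers the residual error enough to offset the added parameter.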

      • Brain size (vs. body size) should be used as a predictor in the models.

      We do not understand the theoretical reason for replacing body size by brain size in the models. Brain size is not a socio-ecological variable. And of course, that would be impossible for modeling brain size itself. Or is the reviewer suggesting that we use brain size as a covariate to evaluate the effects of other variables in the model over and above the effect on brain size? But what is the theoretical basis for this?

      • It is not appropriate to compare the impact of different predictors using their coefficients if the variables were not scaled prior to analysis.

      We thank the Reviewer for this comment; however, standardized coefficients are not unproblematic because their calculation is based on the estimated standard deviations of the variables, which are likely to be affected by sampling (in effect more than the means). We note that the method of standardized coefficients has attracted several criticisms in the literature (see the References section in https://en.wikipedia.org/wiki/Standardized_coefficient). Nevertheless, we now provide a table with these coefficients, which allows an easy comparison for the present study. We also updated tables 1, 2 and 3 to include standardized beta values.
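
      To make the point about scaling concrete, here is a minimal sketch in Python using a single-predictor least-squares fit. The data values and function names are hypothetical and serve only to illustrate the idea; the study itself used PGLS with several predictors. The raw slope changes with the unit of the predictor, whereas the standardized slope (raw slope multiplied by sd(x)/sd(y)) does not, although it does depend on the sample standard deviations, as noted in the response above.

```python
import statistics


def slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def standardized_slope(x, y):
    """Slope rescaled by sample standard deviations: b * sd(x) / sd(y)."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical values, invented for illustration only.
mass_kg = [0.5, 2.0, 7.5, 35.0, 60.0]
mass_g = [1000 * m for m in mass_kg]      # same predictor, different unit
volume = [1.0, 2.2, 4.1, 9.8, 15.5]

raw_kg = slope(mass_kg, volume)
raw_g = slope(mass_g, volume)             # 1000 times smaller than raw_kg
std_kg = standardized_slope(mass_kg, volume)
std_g = standardized_slope(mass_g, volume)  # same as std_kg up to rounding

print(raw_kg, raw_g, std_kg, std_g)
```

      With one predictor the standardized slope equals the Pearson correlation, which is why it is unit-free; with several predictors the same rescaling applies to each coefficient separately.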

      Reviewer #1 (Recommendations For The Authors):

      N/A

      Reviewer #2 (Recommendations For The Authors):

      Contemporary developmental biology has shown that the brain of all mammals, including primates, develops out of a bauplan (or blueprint) made of several fundamental morphological units that have invariant topological relations across species (Nieuwenhuys and Puelles 2016).

      At some point in the discussion the authors acknowledge that:

      "Our aim here was clearly not to provide a clear identification of anatomical boundaries across brain regions in individual species, as others have done using much finer neuroanatomical methods. Such a fine neuroanatomical characterization appears impossible to carry on for a sample size of species compatible with PGLS".

      I do not think it would be impossible to carry out such a neuroanatomical characterization. It would take time and effort, but it is feasible. Such characterization, if performed within the framework of contemporary developmental biology, would allow for well-founded definition and delineation of cortical sectors across primate species, including lissencephalic ones, and would allow for meaningful homologies and interspecies comparisons.

      We do not see how our work would benefit from developmental biology at that point, because it is concerned with evolution, and these are very distinct biological phenomena. We do not understand the reviewer’s focus on lissencephalic species, because they are not so prevalent across primates, and it is unlikely that adding a couple of lissencephalic species would change the conclusions much.

      Minor points:

      • Please, format references according to the instructions of the journal.

      Ok - done

      • The authors could use the same color code across Figures 1, 2, and 3.

      Ok – done

      • The authors say that group hunting "only occurs in a few primate species", but it also occurs in wolves, whales, and other mammalian species.

      We focus on primates here; these other species are irrelevant. Again, this is beside the point.

      Reviewer #3 (Recommendations For The Authors):

      My comments are organized by section below:

      Introduction:

      • Well written and thorough

      • The two questions presented towards the end of the intro are not clear and do not guide the structure of the methods/results sections. I believe it would be more appropriate to ask: 1) if the relative proportions of the FP and DLPFC (relative to ROB) are consistent across primates; and 2) if the relative size of these regions is best predicted by social and/or ecological variables. Then, the results sections could be organized according to these questions (current results section 1 = 1; current results sections 2, 3, 4 = 2.1, 2.2, 2.3)

      As explained above, we agree with the reviewer that the introduction was somewhat misleading and we have edited it extensively. We do not, however, agree with the reviewer regarding the relative (vs absolute) measure. We have discussed this in our response to reviewer 1 regarding the comparison of regional volumes as proxies for number of neurons. The best predictor of the computing capacity of a brain region is its number of neurons, but there is no reason to believe that this capacity should decrease if the rest of the brain increases, as implied by the relative measure that the reviewer proposes. That debate is probably critical in the field of comparative neuroanatomy, and confronting different perspectives would surely be both interesting and insightful, but we feel that it is beyond the scope of the present article.

      Methods:

      • While the methods are straightforward and generally well described, it is unclear which combinations of models were compared or why only population density and distance travelled appear to have been included (in e.g., Fig SI 3.1) even though many more variables were collected.

      We agree that this was not clear enough, and we have tried to improve the description of our model comparison approach, both in the main text and in the supplementary material.

      • Why was body mass rather than ROB used as a predictor in the models? The authors should instead/also include analyses using ROB (so the analysis is of FP and DLPFC size relative to brain size). Using body mass confounds the analyses since they will be impacted by differences in brain size relative to body size.


      Again, we have addressed this issue above. First, body size is a socio-ecological variable (if anything, it especially predicts energetic needs and energy expenditure), but ROB is clearly not. We do not see the theoretical relevance of ROB in a socio-ecological model. Second, from a neurobiological point of view, since within primates the volume of a given brain region is directly related to its number of neurons (again, see work of Herculano-Houzel), which is a good proxy for its computing capacity, we do not see the theoretical reason for considering ROB.

      • It is not appropriate to compare the impact of different predictors using their coefficients if the variables were not scaled prior to analysis. The authors need to implement this in their approach to make such claims.

      We thank the reviewer again for pointing that out. We have addressed this question above.

      • Differences across primates in terms of frontal lobe networks throughout the brain should be acknowledged (e.g., Barrett et al. 2020, J Neurosci).

      We have added that reference to the discussion, together with other references showing that the difference between human and non-human primates is significant, but essentially quantitative, rather than qualitative (the building blocks are relatively well conserved, but their relative weight differs a lot). Thank you for pointing it out.

      I hope the authors find my comments helpful in revising their manuscript.

      We again thank the reviewer for the helpful and constructive comments.

    2. eLife assessment

      This valuable study correlates the size of various prefrontal brain regions in primate species with socioecological variables like foraging distance and population density. The evidence presented is solid but needs to be strengthened with additional analyses that demonstrate the robustness of their results. It is also unclear how this approach would work in other species that show variation in socioecological variables despite lacking clear anatomical markers to define brain areas.

    3. Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS.

      I have read the revised version of the manuscript with interest. I agree with the authors that a focus on ecological vs laboratory variables is a good one, although it might have been useful to reflect that in the title.

      I am happy to see that the authors included additional analyses using different definitions of FP and DLPFC in the supplementary material. As I said in my earlier review, the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital.

      I am sorry the authors are so dismissive of the idea of looking at models where brain size and area size are directly compared, preferring instead to run separate models on brain size and area size. This seems to me a sensible suggestion.

      Similarly, the debate about whether area volume and number of neurons can be equated across the regions is an important one, of which they are a bit dismissive.

      Nevertheless, I think this is an important study. I am happy that we are using imaging data to answer wider phylogenetic questions. Combining detailed anatomy, big data, and phylogenetic statistical frameworks is an important approach.

    4. Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum National d'Histoire Naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience. But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

    5. Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis).

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This fundamental study identifies the homeodomain transcription factor and suspected autism-candidate gene Meis2 as a transcriptional regulator of maturation and end-organ innervation of low-threshold mechanoreceptors (LTMRs) in the dorsal root ganglia (DRG) of mice. For a few years, the view on autism spectrum disorders (ASD) has shifted from a disorder that exclusively affects the brain to a condition that also includes the peripheral somatosensory system, even though our knowledge about the genes involved is incomplete. The study by Desiderio and colleagues is therefore not only scientifically interesting but may also have clinical relevance. The work is convincing, with appropriate and validated methodology in line with current state-of-the-art and the findings contribute both to understanding and potential application.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work examined transcription factor Meis2 in the development of mouse and chick DRG neurons, using a combination of techniques, such as the generation of a new conditional mutant strain of Meis2, behavioral assays, in situ hybridization, transcriptomic study, immunohistochemistry, and electrophysiological (ex vivo skin-nerve preparation) recordings. The authors found that Meis2 was selectively expressed in A fiber LTMRs and that its disruption affects the A-LTMRs' end-organ innervation, transcriptome, electrophysiological properties, and light touch-sensation.

      Strengths:

      1) The authors utilized a well-designed mouse genetics strategy to generate a mouse model where Meis2 is selectively ablated from pre- and post-mitotic mouse DRG neurons. They used a combination of readouts, such as in situ hybridization, immunohistochemistry, transcriptomic analysis, skin-nerve preparation, electrophysiological recordings, and behavioral assays to determine the role of Meis2 in mouse DRG afferents.

      2) They observed a similar preferential expression of Meis2 in large-diameter DRG neurons during development in chicken, suggesting evolutionarily conserved functions of this transcription factor.

      3) They conducted several behavioral assays to probe the reduction of light-touch sensitivity in mouse glabrous and hairy skin. Their behavioral findings support the idea that the function of Meis2 is essential for the development and/or maturation of LTMRs.

      4) RNAseq data provide potential molecular pathways through which Meis2 regulates embryonic target-field innervation.

      5) Well-performed electrophysiological study using skin-nerve preparation and recordings from saphenous and tibial nerves to investigate physiological deficits of Meis2 mutant sensory afferents.

      6) Nice whole-mount IHC of the hairy skin, convincingly showing morphological deficits of Meis2 mutant SA- and RA-LTMRs.

      Overall, this manuscript is well-written. The experimental design and data quality are good, and the conclusion from the experimental results is logical.

      Weaknesses:

      1) Although the authors justify this study for the involvement of Meis2 in Autism and Autism associated disorders, no experiments really investigated Autism-like specific behavior in the Meis2 ablated mice.

      Indeed, in the first version of the manuscript, we used the current understanding of ASD in mouse models and associated sensory defects to articulate our introduction and discussion. As noticed by reviewer 1, none of our experiments really investigated ASD. To avoid over-interpretation of the data, we have now removed sentences mentioning ASD and related references throughout the manuscript.

      2) For mechanical force sensing-related behavioral assays, the authors performed VFH and dynamic cotton swabs for the glabrous skin, and sticky tape on the back for the hairy skin. A few additional experiments involving glabrous skin plantar surfaces, such as sticky tape or flow texture discrimination, would make the conclusion stronger.

      We fully agree that performing more behavioral analyses, investigating in more detail the primary sensory defects as well as some ASD-related behavior, would reinforce our conclusions. Our behavioral analysis clearly showed a loss of sensitivity in response to mechanical stimuli within the light touch range but not for higher range mechanical or noxious thermal stimuli. While the experiments suggested by the reviewer are interesting and would strengthen our conclusions, they are far from trivial and require large cohorts. Given the current laboratory conditions as stated at the outset, these unfortunately are not within reach.

      3) The authors considered von Frey filaments (1 and 1.4 g) as noxious mechanical stimuli (Figure 1E and statement on lines 181-183), which is questionable. Alligator clips or pinpricks are more certain to activate mechanical nociceptors.

      To avoid misinterpretation of the higher von Frey filament tests, we deleted the following statement on page 7: “In the von Frey test, the thresholds for paw withdrawal were similar between all genotypes when using filaments exerting forces ranging from 1 to 1.4g, which likely reflects the activation of mechanical nociception suggesting that Meis2 gene inactivation did not affect nociceptor function.” The sentence “… while sparing other somatosensory behaviors” was also deleted.

      4) There are disconnections and inconsistencies among findings from morphological characterization, physiological recordings, and behavior assays. For example, Meis2 mutant SA-LTMRs show a deficiency in Merkel cell innervation in the glabrous skin but not in hairy skin. With no clear justification, the authors pooled recordings of SA-LTMRs from both glabrous and hairy skin and found a significant increase in mean vibration threshold. Will the results be significantly different if the data are analyzed separately? In addition, whole-mount IHC of Meissner's corpuscles showed morphological changes, but electrophysiological recordings didn't find significant alteration of RAI LTMRs. What does the morphological change mean then? Since the authors found that Meis2 mice are less sensitive to a dynamic cotton swab, which is usually considered as an RA-LTMR mediated behavior, is the SAI-LTMR deficit here responsible for this behavior? Connections among results from different methods are not clear, and the inconsistency should be discussed.

      We thank Reviewer 1 for the careful review of our data and fully agree with the weaknesses identified, which we were ourselves aware of at the time of submission, in particular the lack of stronger connections between histological and electrophysiological data. Electrophysiological studies were conducted on a first cohort of mice in which we focused mostly on WT and Meis2 mutant mice. The goal was to describe differences in electrophysiological properties of identified mechanoreceptors from these two genotypes. While substantial differences between WT and Islet1-Cre mice were not expected, only very few mice with this genotype were examined at that time to confirm this assumption. We fully agree with reviewer 1 that confirming differences in SA-LTMR responses between the hairy and glabrous skin at the electrophysiological level would be interesting and worthwhile. It is assumed that the physiological properties of SA-LTMRs are equivalent in glabrous and hairy skin. Indeed, direct comparisons between glabrous and hairy skin SA-LTMRs have revealed that they have equivalent receptor properties (see Walcher et al J Physiol quoted in the manuscript). We had not recorded from a sufficient number of hairy and glabrous skin SA-LTMRs to make any statistically meaningful comparison. When we noticed the dramatic differences in the innervation patterns of Merkel cell complexes between glabrous and hairy skin, we immediately planned a second cohort of mice, but as explained at the outset of the Public Review, this cohort was sacrificed due to the pandemic lockdown. However, the obtained dataset clearly shows that in Meis2 mutant mice many SA-LTMRs had similar vibration thresholds to those of wild types.

      For Meissner corpuscles, histological analysis evidenced clear morphological differences that could of course be investigated at the level of the dual innervation previously reported by Neubarth et al. It is uncertain whether differences in their electrophysiological responses would be revealed by increasing the number of recorded fibers. For this reason, we clearly stated this limitation in the results section on page 7: “There was a tendency for RA-LTMRs in Isl1Cre/+::Meis2LoxP/LoxP mutant mice to fire fewer action potentials to sinusoids and to the ramp phase of a series of 2-second ramp and hold stimuli, but these differences were not statistically significant (Figure 5B).” Nevertheless, it is important to point out that an electrical search strategy revealed that many Aβ-fibers did not have mechanosensitive receptive fields. Thus, by focusing on LTMRs with a mechanosensitive receptive field, we ignore the fact that fewer fibers are mechanosensitive. This is now more extensively discussed in the discussion section of the manuscript on page 13:

      “Indeed, the electrophysiology methods used here can only identify sensory afferents that have a mechanosensitive receptive field. Primary afferents that have an axon in the skin but no mechanosensitivity can only be identified with a so-called electrical search protocol (45, 46) which was not used here. It is therefore quite likely that many primary afferents that failed to form endings would not be recorded in these experiments e.g. SA-LTMRs and RA-LTMRs that fail to innervate end-organs (Fig.4-6).”

      “From our data, we could not conclude whether SA-LTMR electrophysiological responses are differentially affected in the glabrous versus hairy skin of Meis2 mutants, as suggested by histological analysis. Further electrophysiological analysis focused on SA-LTMRs selectively innervating the glabrous or hairy skin would be necessary to answer this question. Similarly, the decreased sensitivity of Meis2 mutant mice in the cotton swab assay and the morphological defects of Meissner corpuscles evidenced in histological analysis do not correlate with RA-LTMR electrophysiological responses, for which a tendency toward decreased responses was nevertheless measured. The latter might result from an insufficient number of recorded fibers, whereas the former may be due to pooling SA-LTMRs from both the hairy and glabrous skin.”

      Reviewer #2 (Public Review):

      Summary:

      Desiderio and colleagues investigated the role of the TALE (three amino acid loop extension) homeodomain transcription factor Meis2 during maturation and target innervation of mechanoreceptors and their sensation to touch. They start with a series of careful in situ hybridizations to examine Meis2 transcript expression in mouse and chick DRGs of different embryonic stages. By this approach, they identify Meis2+ neurons as slowly- and rapidly adapting A-beta LTMRs, respectively. Retrograde tracing experiments in newborn mice confirmed that Meis2-expressing sensory neurons project to the skin, while unilateral limb bud ablations in chick embryos in ovo showed that these neurons require target-derived signals for survival. The authors further generated a conditional knock-out (cKO) mouse model in which Meis2 is selectively lost in Islet1-expressing, postmitotic neurons in the DRG (Islet1Cre/+::Meis2flox/flox, abbreviated below as cKO). WT and Islet1Cre/+ littermates served as controls. cKO mice did not exhibit any obvious alteration in volume or cellular composition of the DRGs but showed significantly reduced sensitivity to touch stimuli and various innervation defects to different end-organ targets. RNA-sequencing experiments of E18.5 DRGs taken from WT, Islet1Cre/+, and cKO mice reveal extensive gene expression differences between cKO cells and the two controls, including synaptic proteins and components of the GABAergic signaling system. Gene expression also differed considerably between WT and heterozygous Islet1Cre/+ mice while several of the other parameters tested did not. These findings suggest that Islet1 heterozygosity affects gene expression in sensory neurons but not sensory neuron functionality. However, only some of the parameters tested were assessed for all three genotypes. Histological analysis and electrophysiological recordings shed light on the physiological defects resulting from the loss of Meis2. By immunohistochemical approaches, the authors describe distinct innervation defects in glabrous and hairy skin (reduced innervation of Merkel cells by SA1-LTMRs in glabrous but not hairy skin, reduced complexity of A-beta RA1-LTMRs innervating Meissner's corpuscles in glabrous skin, reduced branching and innervation of A-beta RA1-LTMRs in hairy skin). Electrophysiological recordings from ex vivo skin nerve preparations found that several, but not all, of these histological defects are matched by altered responses to external stimuli, indicating that compensation may play a considerable role in this system.

      Strengths:

      This is a well-conducted study that combines different experimental approaches to convincingly show that the transcription factor Meis2 plays an important role in the perception of light touch. The authors describe a new mouse model for compromised touch sensation and identify a number of genes whose expression depends on Meis2 in mouse DRGs. Given that imbalanced MEIS2 expression in humans has been linked to autism and that autism seems to involve an inappropriate response to light touch, the present study makes a novel and important link between this gene and ASD.

      Weaknesses:

      The authors make use of different experimental approaches to investigate the role of Meis2 in touch sensation, but the results obtained by these techniques could be connected better. For instance, the authors identify several genes involved in synapse formation, synaptic transmission, neuronal projections, or axon and dendrite maturation that are up- or downregulated upon targeted Meis2 deletion, but it is unresolved whether these changes can in any way explain the histological, electrophysiological, or behavioral deficits observed in cKO animals. The use of two different controls (WT and Islet1Cre/+) is unsatisfactory and it is not clear why some parameters were studied in all three genotypes (WT, Islet1Cre/+ and cKO) and others only in WT and cKO. In addition, Meis2 mutant mice apparently are less responsive to touch, whereas in humans, mutation or genomic deletion involving the MEIS2 gene locus is associated with ASD, a condition that, if anything, is associated with an elevated sensitivity to touch. It would be interesting to know how the authors reconcile these two findings. As a minor weakness, the first manuscript suffers from some ambiguities and errors, but these can be easily corrected.

      We thank the reviewer for the insightful comments and suggestions.

      The use of two different controls (WT and Islet1Cre/+) is unsatisfactory and it is not clear why some parameters were studied in all three genotypes (WT, Islet1Cre/+ and cKO) and others only in WT and cKO.

      First, we identified a labelling mistake in Figures 4D, 5A and 6A, where the controls shown are from Islet1+/Cre mice and not from WT as reported in the first version. We apologize for this mistake, which has now been corrected. This labelling error does not in any way affect our conclusion; on the contrary, it shows that innervation defects are not the consequence of Islet1 heterozygosity.

      The reviewer wonders why both control genotypes are presented for some data and only one for others. It is quite possible that gene expression changes occur due to a synergistic effect of both heterozygous Meis2 deletion and heterozygous Islet1 deletion. However, we found no evidence that this led to defects in target-field innervation or to changes in the physiological properties of sensory neurons.

      Although it is conceivable that the expression of some genes is modified by a synergistic effect of heterozygous Meis2 deletion and heterozygous Islet1 deletion, several lines of evidence support the view that the defects in target-field innervation and electrophysiological responses are exclusively due to Meis2 deletion. Previous work on Islet1-specific deletion in DRG sensory neurons raises the possibility that some of the phenotypes we report here are in part due to Islet1 heterozygous deletion or to a synergistic effect with Meis2 homozygous deletion.

      1) When Islet1 is conditionally deleted in mice using the Wnt1-Cre strain or at later stages using a tamoxifen-inducible Cre, homozygous pups die a few hours after birth. Early Islet1 deletion results in increased apoptosis in the DRG, a massive loss of DRG sensory neurons and sensory defects associated mostly with nociceptors and with some touch neurons, while proprioceptive neurons are spared (Sun et al., 2008, now included in the revised version of the manuscript). There was a decrease in the number of Ntrk1+ and Ntrk2+ neurons, whereas the number of Ntrk3+ neurons appeared normal. When Islet1 is inactivated later in development, the numbers of Ntrk1+ and Ntrk2+ neurons were normal and only the expression of nociceptor-specific markers was decreased. Since neither the DRG volume nor the number of Ntrk1+, Ntrk2+ and Ntrk3+ neurons is changed in Meis2 cKO using the Islet1-Cre strain, an early significant effect of Islet1 heterozygous deletion is very unlikely.

      2) For distal innervation defects, it is clear from the Wnt1-Cre::Meis2 data (Figure 3E) that the distal innervation phenotype occurs when Meis2 is inactivated, independently of Islet1 expression.

      3) Finally, the lack of differences between WT and Islet1+/Cre mice in behavioral assays and in the electrophysiological characterization of RA-LTMRs of the hairy skin (Figure 6C) and SA-LTMRs (Figure 4B and C) argues for a lack of significant consequences of Islet1 heterozygous deletion on these parameters.

      4) For bulk RNAseq studies, all datasets have now been re-analyzed following Reviewer 2's specific comments (see below). To avoid misinterpretation of the data, the results are now presented differently (see pages 8 and 9) and more critically discussed (see pages 14 and 15). In particular, we now include and discuss references on Islet1 cKO mice.

      We also agree with Reviewer 2 that our RNAseq study only provides clues about genes whose expression could impact distal innervation and electrophysiological responses. However, proving which of those genes are fully responsible for the morphological and electrophysiological defects would require extensive mouse genetic investigations, such as restoring their normal expression level in a Meis2 mutant context, which is beyond the scope of the present study.

      Finally, the reviewer questioned how we could reconcile the lower touch sensitivity in Meis2 mutant mice with the exacerbated touch sensitivity found in ASD patients and mouse models of ASD. As suggested by Reviewer 1, our study did not really investigate ASD specifically. Therefore, to avoid over-interpretation of the data and to follow Reviewer 1's recommendation, we have removed all references to ASD in the revised version of the manuscript. Indeed, to our knowledge, none of the case reports on Meis2 mutant patients investigated sensory function in general and light touch in particular, maybe because of the severe intellectual disability characterizing these patients.

      Reviewer #1 (Recommendations For The Authors):

      In addition to the aforesaid suggestions in the section 2, there are some minor issues:

      We thank the reviewer for the careful reading and for identifying all these typos. All of them have been corrected in the revised version of the manuscript.

      1) There should not be a full stop mark in the title of the article.

      This has been corrected in the new version of the manuscript.

      2) Figure 1C, 1D, please correct the typo "controlateral" to "contralateral".

      This has been corrected in the new version of the manuscript.

      3) Figure 1D, lower graph, Y-axis, please correct the typo "umber" to "number".

      This has been corrected in the new version of the manuscript.

      4) To make it easy for readers, add the names of the behavioral tests on top of the graphs in Fig 1E-H.

      The names of the behavioral tests are now added to the figure.

      5) It would be easier to read the markers' names in IHC and ISH images if they were written outside of image panels. The blue staining color in image 1B could be easily mixed with the background. Suggest change colors.

      Markers for IHC and ISH images are now written outside the image panels, or colors have been changed in Figures 1 and 2 for better clarity.

      6) The font size of Genes' name in Figure 3B is too small and not readable.

      Figure 3 has now been changed following Reviewer 2's recommendation. The small font size in Figure 3B is no longer present in the figure.

      7) Quantification of Fig 3E (number of fibers innervating each dermal papilla or footpad, for example).

      Unfortunately, we did not keep the Wnt1Cre::Meis2LoxP/LoxP strain, which prevents further analysis (see the onset of our answer to the public review).

      8) In Figure 4, please arrange IHC images and their quantification results adjacent to each other.

      The figure has been reorganized and changes in the results section and figure legend were made accordingly.

      9) For consistency, please use either LTMR or LTM (See Figure 4F, 5A, 6C), but not both.

      This has been homogenized throughout the manuscript.

      10) Add arrows/heads to mark the overlaps in Figure 4D.

      Arrows are now added in Figure 4D to point at the overlap between Nefh and CK8 staining.

      11) Figure 5A, 6A, Lines 236, 240, 247, 258, 305, 308, 313, 347, and many more in Figure legends: please check in entire manuscript and make the mouse genotype nomenclature (+/Cre?) consistent. In some places, Cre is written in all upper case (Line 657).

      This has been homogenized throughout the manuscript.

      12) Figure 4G: Histogram color could be darker for better contrast.

      The color of the histograms has been changed in Figures 5 and 6 for better clarity.

      13) Please add the figure number to the Figure 6.

      The figure number is now indicated on the figure.

      14) Figure 6B: Y-axis typo, correct "Nfeh" to "Nefh".

      This typo is now corrected.

      15) Either explain Figure 2B information before that of Figure 2C (In lines 204-207) in the text or change the figure panel sequence to keep the consistent flow of contents.

      The figure has been modified and the panel sequence now follows that of the main text.

      16) Line 213 has a typo: change "form" to "from".

      This typo is now corrected.

      17) Line 423 has a typo. Correct "al" to "all".

      This typo is now corrected.

      18) Line 625 has a typo. Correct "fo" to "of".

      This typo is now corrected.

      19) Line 669 has a typo. Correct "Alexa Fluo" to "Fluor".

      This typo is now corrected.

      20) Line 744: To be consistent in the entire manuscript, write "Nfh" as "Nefh".

      This typo is now corrected.

      21) 740-749: Please add host names for all primary antibodies, as some are given but some are not for the current version.

      We have now indicated the host species for all primary antibodies used in the study.

      22) Line 751 has a typo: change "a" to "as".

      This typo is now corrected.

      23) Line 754: what is for 20'?

      This typo is now corrected.

      24) Line 832: change "day test" to "testing day".

      The change has been made.

      25) Please mention for how many seconds the VFH was administered on the plantar surface in the method.

      A new sentence has been added to the “Von Frey withdrawal test” Methods section (page 30): “During each application, bend filament was maintained for approximately four to five seconds”.

      26) For the sticky tape test, in lieu of hind paw attending bouts, wet-dog shake behavior, the authors also found some scratching behaviors. Did they separately quantify these behaviors? It would be interesting to see exactly which behavior significantly reduced after Meis2 inactivation.

      Unfortunately, at the time of the design of the sticky tape test, we did not consider separating the behaviors considered as “positive” reactions. As these experiments were not video recorded, we are not able to extract this kind of information without generating a new cohort of mice and repeating this experiment.

      27) Line 344-345: consider rephrasing the sentence.

      This sentence has been removed.

      Reviewer #2 (Recommendations For The Authors):

      This is a beautiful and well-conducted study with all the strengths listed in the paragraphs above. Nevertheless, there are still some open questions, ambiguities in the presentation, and minor errors that I would recommend addressing.

      Major Points:

      1) The authors performed RNA-seq analysis from E18.5 mouse total DRGs from three different genotypes, WT, Islet1Cre/+ and cKO. Although this approach identified several interesting Meis2-dependent candidate genes, the presentation of the results is confusing, and the publication would gain impact if the RNA-seq results were better connected to the histological, behavioral, and electrophysiological data. Specific concerns:

      1.1) The gene expression profiles of WT and Islet1Cre/+ samples are remarkably divergent. According to Yang Development 2006, Islet1-Cre was generated by knocking in Cre into the endogenous Islet1 locus and replacing the Isl1 ATG, hence resulting in a heterozygous null for Islet1. When purely technical derivations can be excluded, the RNAseq results presented here suggest that heterozygous loss of Islet1 causes considerable gene expression changes in the postnatal DRG. For analysis of the RNAseq results, the authors focus on genes that are differentially expressed between one experimental condition (Islet1Cre/+::Meis2flox/flox) and either one of two controls (WT or Islet1Cre/+). Hence, they pool the genes that are differently expressed between cKO and Islet1Cre/+ with the genes that are different between cKO and WT. This approach mixes gene expression differences that result from two different genetic alterations, heterozygosity of Islet1 and targeted deletion of Meis2, respectively. It seems much more logical to compare the results pairwise.

      We agree with Reviewer 2 that heterozygous deletion of Islet1 causes a significant change in gene expression that seems to correlate very little with any of the phenotypes we investigated in the study. When Islet1 is conditionally deleted in mice using the Wnt1-Cre strain, pups die a few hours after birth and display increased apoptosis in the DRG, massive loss of DRG sensory neurons and sensory defects associated mostly with nociceptors and with some touch neurons, while proprioceptive neurons are spared (Sun et al., 2008, now included in the revised version of the manuscript). There is a decrease in the numbers of Ntrk1+ and Ntrk2+ neurons, whereas the number of Ntrk3+ neurons appears normal. Later Isl1 inactivation does not induce changes in the number of neurons and does not change Ntrk1 and Ntrk2 expression. As explained in the answer to the public reviews, bulk RNAseq data have now been reanalyzed following the reviewer's suggestions and presented accordingly in the related figures.

      The study by Sun et al. also reported DEGs following Islet1 homozygous deletion, but data on Islet1 heterozygous deletion are not included. However, out of the 60 most dysregulated genes identified in their study, only 6 were differentially expressed in our datasets. Importantly, DEGs in their study were identified using microarrays. In another study, the same group showed that Brn3a (another transcription factor important for DRG neuron differentiation) and Islet1 exhibit negative epistasis on sensory gene expression (Dykes et al., 2011, now included in the revised version of the manuscript). Thus, we cannot rule out that similar rules apply for Islet1 and Meis2. However, given the high diversity of DRG sensory neurons, interpreting our bulk RNAseq analysis in such a direction might lead to misinterpretation.

      1.2) Along the same line, gene expression changes in Islet1Cre/+ DRGs seem to have little functional consequences, at least in the cases where all three genotypes were analyzed (target dependency (Fig. 1E), behavior (Fig. 1F), innervation (Fig. 4F, 6C)). Why were some parameters measured in all three genotypes and others only for WT and cKO? The authors probably reason that parameters that do not differ between WT and cKO animals will likely also not differ between WT and Islet1Cre/+. But what about parameters that do differ? Considering that the innervation of Merkel cells (Fig. 4E) and Meissner corpuscles (Fig. 5A) differ profoundly between WT and cKO, it would be interesting to know what this innervation looks like in Islet1Cre/+ DRGs. NEFH staining together with CK8 or S100beta from existing tissue sections should easily answer this question.

      As explained in the answer to the public reviews, there was a mistake in the annotation of the control in Figure 4D and E and in Figure 5, which has now been corrected. The target-dependency experiments were conducted in chick embryos and therefore have no associated genotype.

      1.3) Was a minimum cut-off for gene expression applied? The up-and downregulated genes in Fig. 3B list a number of pseudogenes and predicted genes. A quick (and incomplete) check for their expression in Fig2 Supple Table 1 shows that only a few reads were detected for most of them. With such low expression, even small changes will show up as significant differences.

      In our first analysis, a cut-off of 10 reads was applied. As Reviewer 2 mentioned, this cut-off included several pseudogenes and predicted genes with low expression for which small changes were significant. We have now re-analyzed the dataset using a cut-off of 100 reads. This excluded most of the previous predicted genes and pseudogenes from the analysis and resulted in a much smaller number of DEGs for each dataset. As recommended by Reviewer 2, we have also now performed the DAVID analysis separately. These results are now presented in Figure 3 and corresponding supplementary figures.
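
      As an illustration only, a minimum-read cut-off of this kind can be sketched as below. The gene names, the counts, and the choice to sum reads across samples are assumptions made for the example; the response does not specify how the threshold was applied in the actual pipeline.

```python
# Hypothetical sketch of a minimum-read cut-off applied before differential
# expression testing. Gene names and counts are invented for illustration.

def filter_by_min_reads(counts, min_reads):
    """Keep genes whose read count summed across samples is >= min_reads.

    Summing across samples is an assumption; a per-sample or mean-based
    threshold would be an equally plausible reading of "cut-off".
    """
    return {gene: reads for gene, reads in counts.items()
            if sum(reads) >= min_reads}

counts = {
    "GeneA": [450, 520, 610],  # well expressed, kept at both thresholds
    "GeneB": [3, 5, 4],        # 12 reads in total: passes 10, fails 100
    "GeneC": [0, 2, 1],        # 3 reads in total: fails both
}

kept_10 = filter_by_min_reads(counts, 10)
kept_100 = filter_by_min_reads(counts, 100)

print(sorted(kept_10))
print(sorted(kept_100))
```

      With the stricter threshold, the low-count gene that passed at 10 reads is dropped, which is the effect described for the pseudogenes and predicted genes.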

      1.4) Given that bulk RNAseq from whole embryonic DRGs was performed, it would be interesting to know what cell type(s) express the Meis2-dependent transcripts. To address this question, the authors resort to published scRNAseq data by Usoskin Nat Neurosci 2015. They correlate the expression of all 488 DEGs (different between cKO and either WT or Islet1Cre/+) with the expression of Meis2 in the sensory neuron subtypes that were classified in the Usoskin paper. From that they conclude that many Meis2-dependent genes were expressed in the same sensory neuron classes as Meis2 itself. This is not apparent from Fig. 3 Supplementary 2. Neither do the 488 DEGs seem to be in any way enriched in the MEIS2-expressing cell clusters NF2/3/4/5, nor is cluster PEP1 particularly high in Meis2 expression. Immunostaining for MEIS2 together with a few selected DEGs would be a better way to assess co-expression.

      We agree with Reviewer 2 that the correlation between DEGs and the expression of Meis2 in the sensory neuron subtypes was far from striking. In our opinion, the new analysis now shows a more robust correlation. However, it has to be kept in mind that not all DEGs are expected to be direct Meis2 target genes, and therefore not all are expected to be enriched in the same Meis2-expressing population. This also holds true for genes that could be de-repressed or induced following Meis2 inactivation. Moreover, the scRNAseq by Usoskin et al. was performed on adult sensory neurons, whereas our bulk RNAseq was performed on E18.5 embryos. Thus, because gene expression in developing sensory neurons is well known to be highly dynamic, the transcriptional signature of sensory neuron subclasses in E18.5 embryos is not expected to perfectly match the transcriptional signature of adult subclasses. Finally, we agree that immunostaining for Meis2 together with a few selected DEGs would give a better answer on whether they co-localize or not, but our lack of experience with those antibodies, together with the lack of financial support for the proposal, precludes addressing this pertinent point.

      1.5) The authors identify Gabra1 and Gabra4 as upregulated and Gabrr1 as downregulated genes in MEIS2 cKO animals. Does this reflect a change in GABA-receptor subunit composition in LTMRs?

      This is an interesting point. First, in our new analysis, increasing the cut-off to 100 reads excluded Gabrr1 from the DEGs. Based on our results, we cannot conclude whether Gabra1 and Gabra4 up-regulation reflects a change in GABA receptor composition. However, among the genes associated with the GEO term "GABAergic synapse", Gabra1 and Gabra4 were up-regulated whereas the ionotropic glutamate receptor Grid1 was downregulated, which argues rather for an imbalance in GABA/glutamate transmission. Finally, the increased GABAR expression in the LTMRs might be expected to increase pre-synaptic inhibition at the LTMR synapses onto target neurons in the dorsal horn, thus decreasing synaptic transmission from these neurons into spinal circuits.

      2) The authors assessed SA-LTMR innervating Merkel cells in glabrous and hairy skin by IFC staining for neurofilament H and electrophysiological recordings. Due to the small sample size, they pooled recordings, reasoning that nerves that do not successfully innervate Merkel cells (i.e. cKO glabrous skin) do not evoke electrophysiological responses following a touch stimulus.

      2.1) It is undoubtedly true that non-innervating nerves will likely not show electrophysiological responses. However, by pooling the recordings of SA-LTMRs from glabrous and hairy skin, the data obtained from the 20% successful recordings of SA-LTMRs from glabrous cKO skin (according to Fig. 4E, upper panel) will be overrepresented and hence lead to a systematic bias. How many recordings were made from the glabrous and hairy skin of each genotype? In case the number of recordings from cKO/glabrous skin is the limiting factor, does the observed difference in vibration threshold hold true when only recordings from hairy skin are compared?

      As explained in the text and in our answers to Reviewer 1, data for hairy and glabrous SAMs were initially pooled as no differences between them were expected, and the electrophysiological experiments planned next were compromised by the Covid19 pandemic. We are sorry that at this point, we cannot provide additional experiments to clarify this important point. In addition, as mention

      3) From the IFC images shown in Fig. 6A, it is not clear how the authors quantified branch points and innervated hair follicles.

      Branch points correspond to each point at which a nerve splits into two or more nerves. Innervated follicles correspond to follicles that are entangled by circumferential and/or lanceolate Nefh+ endings.

      4) The quality of the data is very high, but there are several ambiguities and errors in their presentation.

      We apologize for this mistake. Figure 1 Supplementary 1 that reports data from Cat walk analysis is now appropriately included in the files.

      4.2) Fig. 3A is confusing and the figure legend just repeats what is already said in the text. What do yellow, blue, and pink represent?

      Figure 3 has now been fully remade. The legend is now better indicated in Figure 3A. We hope it is now clearer.

      4.3) What genotype do the black, grey, and white boxplots in Fig. 6C Fig. 3 Supplementary 1B correspond to?

      The legends were missing for Figure 6C and Figure 3 supplementary 1B. They are now appropriately included.

      4.4) Up- and downregulated genes are assigned differently in Fig. 3 and Fig. 3 Supplementary 2. The figure legend of Fig. 3 Suppl 2 lists panel B as up-regulated genes but the same genes are labeled down-regulated in Fig. 3.

      We apologize for this previous mistake. Figure 3 and corresponding supplementary figures have been redone in the new version.

      4.5) Fig. 3E would benefit from a more detailed description. One can easily appreciate that the neurofilament H staining in the cKO sample is different from that of the WT sample but what exactly can be seen here?

      We added the following sentence in the results section: “In WT newborn mice, numerous Nefh+ sensory fibers surround all dermal papillae of the hairy skin and footpad of the glabrous skin, whereas in Wnt1Cre::Meis2LoxP/LoxP littermates, very few Nefh+ sensory fibers are present and they poorly innervate the dermal papillae and footpads.“.

      4.6) The figure legend to Fig. 4A is unclear. Does the graph show the sum of all recordings performed? From the text, one would guess that the bars correspond to the cKO samples, but this is not specified. Do the controls correspond to WT, Islet1Cre/+ or a mixture of both? In addition, the graph in the lower panel is labeled % Ab fibers, the figure legend reads % of tap units among Ab fibers.

      The graphs show the number of tap units identified among all recorded Aβ-fibers. Numbers show the number of tap units over the number of recorded fibers. This has now been reformulated in the last version of the manuscript.

      4.7) The abbreviation SAM in figure legends 4F, G is not introduced.

      This is now indicated in the figure legend.

      4.8) Readers who are not familiar with the traces above the graphs in 4F and 4G will find a more detailed description helpful.

      This is now indicated in the figure legend.

      4.9) Lines 274-275: Does the statement "Finally, consistent with the lack of neuronal loss in Isl1Cre/+::Meis2LoxP/LoxP, the number of recorded fibers were identical in WT and Isl1Cre/+::Meis2LoxP/LoxP." refer to Fig. 4G? This is not specified in the text.

      These data were not included in the first version of the manuscript as we thought they were not significantly informative. They just indicate the overall numbers of fibers that were recorded in electrophysiological experiments. The sentence has now been removed in the last version of the manuscript to avoid misunderstanding.

      4.10) There is no Fig. 6 supplementary 1.

      The typo is now corrected. The corresponding data were in fact in Figure 5 Supplementary 1.

      Minor points:

      • Gangfuß et al. report that a patient previously diagnosed with a range of neurological deficits including the diagnosis of severe infantile autism is heterozygous mutant for MEIS2. Although this study links MEIS2 gene function to ASD in the wider sense, adding a few additional references will make the link stronger. Examples are Shimojima et al., Hum Genome Var 2017 or Bae et al., Science 2022.

      These two references have now been included in the introduction section of the manuscript.

      • In some figures (e.g. Fig. 4) the numbering of the panels does not follow the order in which the respective data are mentioned in the text.

      Figure 4 is now re-organized so that panels follow the same order as in the results section.

    2. Joint Public Review:

      Summary:

      Desiderio and colleagues investigated the role of the TALE (three amino acid loop extension) homeodomain transcription factor Meis2 during maturation and target innervation of mechanoreceptors and their sensation to touch. They start with a series of careful in situ hybridizations and immunohistochemical analyses to examine Meis2 transcript expression and protein distribution in mouse and chick DRGs of different embryonic stages. By this approach, they identify Meis2+ neurons as slowly- and rapidly adapting A-beta LTMRs, respectively. Retrograde tracing experiments in newborn mice confirmed that Meis2-expressing sensory neurons project to the skin, while unilateral limb bud ablations in chick embryos in ovo showed that these neurons require target-derived signals for survival. The authors further generated a conditional knock-out (cKO) mouse model in which Meis2 is selectively lost in Islet1-expressing, postmitotic neurons in the DRG (Islet1Cre/+::Meis2flox/flox, abbreviated below as cKO). WT and Islet1Cre/+ littermates served as controls. cKO mice did not exhibit any obvious alteration in volume or cellular composition of the DRGs but showed significantly reduced sensitivity to touch stimuli and various innervation defects to different end-organ targets. RNA-sequencing experiments of E18.5 DRGs taken from WT, Islet1Cre/+ and cKO mice reveal extensive gene expression differences between cKO cells and the two controls, including synaptic proteins and components of GABAergic and glutamatergic transmission. Histological analysis and electrophysiological recordings shed light on the physiological defects resulting from the loss of Meis2. By immunohistochemical approaches, the authors describe distinct innervation defects in glabrous and hairy skin (reduced innervation of Merkel cells by SA1-LTMRs in glabrous but not hairy skin, reduced complexity of A-beta RA1-LTMRs innervating Meissner's corpuscles in glabrous skin, reduced branching and innervation of A-beta RA1-LTMRs in hairy skin). Electrophysiological recordings from ex vivo skin nerve preparations found that several, but not all, of these histological defects are matched by altered responses to external stimuli, indicating that compensation may play a considerable role in this system. This study will be of interest to developmental biologists and neuroscientists, in particular those interested in the sensation of touch.

      Strengths:<br /> This is a well-conducted study that combines different experimental approaches to convincingly show that the transcription factor Meis2 plays an important role in the perception of light touch. The authors describe a new mouse model for compromised touch sensation, characterize it by histology and electrophysiological recordings, and identify several genes whose expression depends on Meis2 in mouse DRGs.

      Weaknesses:<br /> The authors use different experimental approaches to investigate the role of Meis2 in touch sensation, but the results obtained by these techniques could be better connected. For instance, the authors identify several genes involved in synapse formation, synaptic transmission, neuronal projections, or axon and dendrite maturation that are up- or downregulated upon targeted Meis2 deletion, but it remains to be resolved whether these changes explain the histological, electrophysiological, or behavioral deficits observed in cKO animals.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript provides a fundamental contribution to the understanding of the role of intrinsically disordered proteins in circadian clocks and the potential involvement of phase separation mechanisms. The authors convincingly report on the structural and biochemical aspects and the molecular interactions of the intrinsically disordered protein FRQ. This paper will be of interest to scientists focusing on circadian clock regulation, liquid-liquid phase separation, and phosphorylation.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      "Phosphorylation, disorder, and phase separation govern the behavior of Frequency in the fungal circadian clock" is a convincing manuscript that delves into the structural and biochemical aspects of FRQ and the FFC under both LLPS and non-LLPS conditions. Circadian clocks serve as adaptations to the daily rhythms of sunlight, providing a reliable internal representation of local time.

      All circadian clocks are composed of positive and negative components. The FFC contributes negative feedback to the Neurospora circadian oscillator. It consists of FRQ, CK1, and FRH. The FFC facilitates close interaction between CK1 and the WCC, with CK1-mediated phosphorylation disrupting WCC:c-box interactions necessary for restarting the circadian cycle.

      Despite the significance of FRQ and the FFC, challenges associated with purifying and stabilizing FRQ have hindered in vitro studies. Here, researchers successfully developed a protocol for purifying recombinant FRQ expressed in E. coli.

      Armed with full-length FRQ, they utilized spin-labeled FRQ, CK1, and FRH to gain structural insights into FRQ and the FFC using ESR. These studies revealed a somewhat ordered core and a disordered periphery in FRQ, consistent with prior investigations using limited proteolysis assays. Additionally, p-FRQ exhibited greater conformational flexibility than np-FRQ, and CK1 and FRH were found in close proximity within the FFC. The study further demonstrated that under LLPS conditions in vitro, FRQ undergoes phase separation, encapsulating FRH and CK1 within LLPS droplets, ultimately diminishing CK1 activity within the FFC. Intriguingly, higher temperatures enhanced LLPS formation, suggesting a potential role of LLPS in the fungal clock's temperature compensation mechanism.

      Biological significance was supported by live imaging of Neurospora, revealing FRQ foci at the periphery of nuclei consistent with LLPS. The amino acid sequence of FRQ conferred LLPS properties, and a comparison of clock repressor protein sequences in other eukaryotes indicated that LLPS formation might be a conserved process within the negative arms of these circadian clocks.

      In summary, this manuscript represents a valuable advancement with solid evidence in the understanding of a circadian clock system that has proven challenging to characterize structurally due to obstacles linked to FRQ purification and stability. The implications of LLPS formation in the negative arm of other eukaryotic clocks and its role in temperature compensation are highly intriguing.

      Strengths:

      The strengths of the manuscript include the scientific rigor of the experiments, the importance of the topic to the field of chronobiology, and new mechanistic insights obtained.

      Weaknesses:

      This reviewer had questions regarding some of the conclusions reached.

      Recommendations For The Authors:

      The reviewer has a few questions for the authors:

      1) Concerning the reduced activity of sequestered CK1 within LLPS droplets with FRQ, to what extent is this decrease attributed to distinct buffer conditions for LLPS formation compared to non-LLPS conditions?

      We don’t believe that these buffer conditions significantly influence the change in FRQ phosphorylation by CK1 observed at elevated temperatures. The pH and ionic strength of the buffer are in keeping with physiological conditions (300 mM NaCl, 50 mM sodium phosphate, 10 mM MgCl2, pH 7.5); CK1 autophosphorylation is robust and generally increases with temperature under these conditions (Figure 7B). However, as LLPS increases, CK1 autophosphorylation remains high, whereas phosphorylation of FRQ dramatically decreases. In fact, we chose to alter temperature specifically to induce changes in phase behavior under constant buffer conditions. In this way LLPS could be increased, and FRQ phosphorylation evaluated, without altering the solution composition. Thus, we believe that the reduced CK1 kinase activity toward FRQ as a substrate is directly due to the impact of the generated LLPS milieu, i.e. the changes in structural/dynamic properties of FRQ and/or CK1 induced by being in a phase-separated microenvironment, which could be substantially different from a non-phase-separated buffer environment. For example, previous work on the disordered region of DDX4 [Brady et al. 2017, and Nott et al. 2015] shows that even the water content and the stability of biomolecules such as double-stranded nucleic acids encapsulated within the droplets differ between non-phase-separated and phase-separated DDX4 samples.

      Nott T.J. et al. Phase transition of a disordered nuage protein generates environmentally responsive membraneless organelles. Mol. Cell. 2015 57 936-947.

      Brady J.P. et al. Structural and hydrodynamic properties of an intrinsically disordered region of a germ cell-specific protein on phase separation. PNAS 2017 114 8194-8203.

      In the results section we have clarified the use of temperature to control LLPS, “We compared the phosphorylation of FRQ by CK1 in a buffer that supports phase separation under different temperatures, using the latter as a means to control the degree of LLPS without altering the solution composition.”

      On p. 16 of the discussion we have elaborated on the above point: “We believe that the reduced CK1 kinase activity toward FRQ as a substrate is directly due to the impact of the generated LLPS milieu, i.e. the changes in structural/dynamic properties of FRQ and/or CK1 induced by being in a phase-separated microenvironment, which could be substantially different from a non-phase-separated buffer environment. For example, previous work on the disordered region of DDX4 (Brady, et al., 2017; Nott, et al., 2015) shows that even the water content and the stability of biomolecules such as double-stranded nucleic acids encapsulated within the droplets differ between non-phase-separated and phase-separated DDX4 samples. Indeed, the spin-labeling experiments indicate that the dynamics of FRQ have been altered by LLPS (Fig. 7D).”

      2) The DEER technique demonstrated spatial proximity between FRH and CK1 when bound to FRQ in the FFC. Is there evidence suggesting their lack of proximity in the absence of FRQ? Also, how important is this spatial proximity to FFC function?

      We have additional data substantiating that FRH and CK1 do not interact in the absence of FRQ. In the revised paper we have included the results of a SEC-MALS experiment showing that FRH and CK1 elute separately when mixed in equimolar amounts and applied to an analytical S200 column coupled to a MALS detector (Figure 1 below and Fig. S8). The importance of the FRH and CK1 proximity is currently unknown, but there are reasons to believe that it could have functional consequences. For example, CK1, as recruited by FRQ, phosphorylates the White-Collar Complex (WCC) in the repressive arm of the circadian oscillator [e.g. He et al. Genes Dev. 20, 2552 (2006); Wang et al, Mol. Cell 74, 771 (2019)]. Interactions between the WCC and the FFC are mediated at least in part by FRH binding to White Collar-2 [Conrad et al. EMBO J. 35, 1707 (2016)]. Thus, FRH:FRQ may effectively bridge CK1 to the WCC to facilitate the phosphorylation of the latter by the former.

      He et al. CKI and CKII mediate the FREQUENCY-dependent phosphorylation of the WHITE COLLAR complex to close the Neurospora circadian negative feedback loop. Genes Dev. 2006 20, 2552-2565.

      Wang B. et al. The Phospho-Code Determining Circadian Feedback Loop Closure and Output in Neurospora Mol. Cell 2019 74, 771-784.

      Conrad et al. Structure of the frequency-interacting RNA helicase: a protein interaction hub for the circadian clock. EMBO J. 2016 35, 1707-1719.

      Author response image 1.

      Size-exclusion chromatography- multiangle light scattering (SEC-MALS) of a mixture of purified FRH and CK1. The proteins elute separately as monomers with no evidence of co-migration.

      3) Is there any indication that impairing FRQ's ability to undergo LLPS disrupts clock function?

      We do not currently have direct evidence that LLPS of FRQ is essential for clock function. These experiments are ongoing, but complicated by the fact that changes to FRQ predicted to alter LLPS behavior also have the potential to perturb its many other clock-related functions that include dynamic interactions with partners, dynamic post-translational modification and rates of synthesis and degradation. That said, the intrinsic disorder of FRQ is important for it to act as a protein interaction hub, and large intrinsically disordered regions (IDRs) very often mediate LLPS, as is certainly the case here. In this work, we argue that the ability of FRQ to sequester clock proteins during the TTFL may involve LLPS. Additionally, we show that the phosphorylation state of FRQ, which is a critical factor in clock period determination, depends on LLPS. Given that the conditions under which FRQ phase separates are physiological in nature and that live-cell imaging is consistent with FRQ phase separation in the nucleus, it seems likely that FRQ does phase separate in Neurospora. Furthermore, given that the sequence features of FRQ that mediate phase-separation are conserved not only across FRQ homologs but also in other functionally related clock proteins, it is probable, albeit worthy of further investigation, that LLPS has functional consequences for the clock. See the response to reviewer 3 for more discussion on this topic.

      Minor Points:

      Indeed, we have included a reference to this paper on p. 3: “Emerging studies in plants (Jung, et al., 2020), flies (Xiao, et al., 2021) and cyanobacteria (Cohen, et al., 2014; Pattanayak, et al., 2020) implicate LLPS in circadian clocks, and in Neurospora it has recently been shown that the Period-2 (PRD-2) RNA-binding protein influences frq mRNA localization through a mechanism potentially mediated by LLPS (Bartholomai, et al., 2022).”

      • On page 9, six lines from the top, please insert "of" between "distributions" and "p-FRQ".

      We have corrected this typo.

      Reviewer #2 (Public Review):

      Summary:

      This study presents data from a broad range of methods (biochemical, EPR, SAXS, microscopy, etc.) on the large, disordered protein FRQ relevant to circadian clocks and its interaction partners FRH and CK1, providing novel and fundamental insight into oligomerization state, local dynamics, and overall structure as a function of phosphorylation and association. Liquid-liquid phase separation is observed. These findings have bearings on the mechanistic understanding of circadian clocks, and on functional aspects of disordered proteins in general.

      Strengths:

      This is a thorough work that is well presented. The data are of overall high quality given the difficulty of working with an intrinsically disordered protein, and the conclusions are sufficiently circumspect and qualitative to not overinterpret the mostly low-resolution data.

      Weaknesses:

      None

      Recommendations For The Authors:

      1) Fig.2B: Beyond the SEC part (absorbance vs elution volume), I don't understand this plot, in particular the horizontal lines. They appear to be correlating molecular weight with normalized absorption at 280 nm, but the chromatogram amplitudes are different. Clarify, or modify the plot. There are also some disconnected line segments between 10-11 mL - these seem to be spurious.

      We apologize for the confusion. The horizontal lines are meant to only denote the average molecular weights of the elution peaks and not correlate with the A280 values. The disconnected lines are the light-scattering molecular weight readouts from which the horizontal lines are derived. The problematic nature of the figure is that the full elution traces and MALS traces across the peaks call for different scales to best depict the relevant features of the data. We have reworked the figure and legend to make the key points more clear.

      2) It could be useful to add AF2 secondary structure predictions, pLDDT, and the helical propensity analysis to the sequence ribbon in Fig.1C.

      Thank you for the suggestion, we have updated the figure to incorporate the pLDDT scores into the linear sequence map, as well as the secondary structure predictions.

      3) Fig.3D: It would be better to show the raw data rather than the fits. At the same time, I appreciate the fact that the authors resisted the temptation to show distance distributions.

      Yes, we agree that it is important to show the raw data; it is included in the supplementary section. Depicting the raw data here unfortunately obscures the differences in the traces and we believe that showing the data as a superposition is quite useful to convey the main differences among the sites. However, we have now explicitly stated in the figure legend that the corresponding raw data traces are given in Figures S5-6.

      4) Fig.5: For all distance distributions, error intervals should be added (typically done in terms of shaded bands around the best-fit distribution). As shown, precision is visually overstated. The error analysis shown in the SI is dubious, as it shows some distances have no error whatsoever (e.g. 6nm in 370C-490C), which is not possible.

      We did previously show the error intervals in the SI, but we agree that it is better to include them here as well, and have done so in the new Figure 5. With respect to the error analysis, we are following the methodology described in the following paper:

      Srivastava, M. and Freed J., Singular Value Decomposition Method To Determine Distance Distributions in Pulsed Dipolar Electron Spin Resonance: II. Estimating Uncertainty. J. Phys Chem A (2019) 123:359-370. doi: 10.1021/acs.jpca.8b07673.

      Briefly, the uncertainty we plot shows the "range" of singular values over which the singular value decomposition (SVD) solution remains converged. For most of the data displayed in this paper we only used the first few singular values (SVs) and the solution remained converged for ± 1 or 2 SVs near the optimum solution. For example, if the optimum solution was 4 SVs then the range in which the solution remained converged is ~3-6 SVs. We plot three lines: the solutions with the lowest, the highest, and the optimum number of SVs. In the SI figures the optimum SV solution is shown in black and the region between the converged solutions with the highest and lowest number of SVs is shaded in red. Owing to the point-wise reconstruction of the distance distribution, the SVD method enables localized uncertainty at each distance value. Therefore, some points will have high uncertainty, whereas others will have low uncertainty. The distance that may appear to have no uncertainty actually has very low uncertainty, which can be seen on close inspection. In these cases, we observe this "isosbestic" type behavior where the P(r) appears to change little across the acceptable solutions and hence there is only a small range of P(r) values at that particular r. This behavior results from multimodal distributions wherein the change in SVs shifts neighboring peaks to lower and higher distances respectively, producing an apparent cancelation effect. What we believe is most important for the biochemical interpretation, and accurately reflected by this analysis, is the general width of the uncertainty across the distribution and how this impacts the error in both the mean and the overall skewing of the distribution at short or long distances.

      Details of the error treatment as described above have been added to the supplementary methods section.
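The band construction described in this response can be illustrated with a minimal sketch. It is not the authors' code and does not use the real DEER dipolar kernel: the random `kernel`, the toy distribution `p_true`, the noise level, and the function names `truncated_svd_solution` and `pointwise_band` are all hypothetical and chosen only for illustration. The sketch assumes the band at each distance is simply the interval between the solutions obtained with the lowest and the highest number of singular values that still count as converged.

```python
import numpy as np

def truncated_svd_solution(kernel, signal, n_sv):
    """Solve kernel @ p = signal keeping only the n_sv largest singular values."""
    u, sigma, vt = np.linalg.svd(kernel, full_matrices=False)
    coeffs = (u[:, :n_sv].T @ signal) / sigma[:n_sv]
    return vt[:n_sv].T @ coeffs

def pointwise_band(kernel, signal, n_low, n_opt, n_high):
    """Return the optimum solution plus a point-wise band.

    The band is bounded by the solutions with the lowest (n_low) and
    highest (n_high) number of singular values (an assumption of this sketch).
    """
    p_low = truncated_svd_solution(kernel, signal, n_low)
    p_opt = truncated_svd_solution(kernel, signal, n_opt)
    p_high = truncated_svd_solution(kernel, signal, n_high)
    lower = np.minimum(p_low, p_high)
    upper = np.maximum(p_low, p_high)
    return p_opt, lower, upper

# Toy data: a random stand-in kernel (NOT a dipolar kernel) and a single-peak
# distribution sampled at 12 hypothetical distance points.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(40, 12))
p_true = np.exp(-0.5 * ((np.arange(12) - 5.0) / 1.5) ** 2)
signal = kernel @ p_true + 0.01 * rng.normal(size=40)

p_opt, lower, upper = pointwise_band(kernel, signal, n_low=3, n_opt=4, n_high=6)
width = upper - lower  # local uncertainty; it can be very small at some points
```

Because the band is evaluated separately at every distance point, `width` can approach zero wherever the low- and high-SV solutions happen to cross, which is the "isosbestic" appearance the response describes.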

      5) The Discussion (p.13) states that the SAXS and DEER data show that disorder is greater than in a molten globule and smaller than in a denatured protein. Evidence to support this statement (molten globule DEER/SAXS reference data etc.) should be made explicit.

      We will make the statement more explicit by changing it to the following: “Notably, the shape of the Kratky plots generated from the SAXS data suggest a degree of disorder that is substantially greater than that expected of a molten globule (Kataoka, et al., 1997), but far from that of a completely denatured protein (Kikhney, et al., 2015; Martin, Erik W., et al., 2021). Similarly, the DEER distributions, though non-uniform across the various sites examined, indicate more disorder than that of a molten globule (Selmke et al., 2018) but more order than a completely unfolded protein (van Son et al. 2015).”

      van Son, M., et al. Double Electron−Electron Spin Resonance Tracks Flavodoxin Folding, J. Phys. Chem. B 2015, 119, 13507−13514. doi: 10.1021/acs.jpcb.5b00856.

      Selmke, B. et al. Open and Closed Form of Maltose Binding Protein in Its Native and Molten Globule State As Studied by Electron Paramagnetic Resonance Spectroscopy. Biochemistry 2018, 57, 5507−5512 doi: 10.1021/acs.biochem.8b00322.

      6) Fig. S11B could be promoted to the main paper.

      This comment makes a good point. Figure 8 is now an updated scheme, similar to the previous Fig. S11B. Thank you for the suggestion.

      Minor corrections:

      p.1: "composed from" -> "composed of"

      p.2: TFFLs -> TTFLs

      p.2: "and CK1 via" => "and to CK1 via"

      p.5: "Nickel" -> "nickel"

      p.5: "Size Exclusion Chromatography" -> "Size exclusion chromatography"

      p.5: "Multi Angle Light Scattering" -> "multi-angle light scattering"

      Fig.2 caption: "non-phosphorylated (np-FRQ)" -> "non-phosphorylated FRQ (np-FRQ)"

      Fig. S3: What are the units on the horizontal axis?

      Fig. 5H is too small

      Fig. S8, S9: all distance distribution plots show a spurious "1"

      Fig. 6A has font sizes that are too small to read

      p.11: "cytoplasm facing" -> "cytoplasm-facing"

      p.11: "temperature dependent" -> "temperature-dependent"

      p.12: "substrate-sequestration and product-release" -> "substrate sequestration and product release"

      p.12: "depend highly buffer composition" -> "depend highly on buffer composition"

      We thank the reviewer for finding these errors and their attention to detail. All of these minor points have been addressed in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript from Tariq and Maurici et al. presents important biochemical and biophysical data linking protein phosphorylation to phase separation behavior in the repressive arm of the Neurospora circadian clock. This is an important topic that contributes to what is likely a conceptual shift in the field. While I find the connection to the in vivo physiology of the clock to be still unclear, this can be a topic handled in future studies.

      Strengths:

      The ability to prepare purified versions of unphosphorylated FRQ and P-FRQ phosphorylated by CK-1 is a major advance that allowed the authors to characterize the role of phosphorylation in structural changes in FRQ and its impact on phase separation in vitro.

      Weaknesses:

      The major question that remains unanswered from my perspective is whether phase separation plays a key role in the feedback loop that sustains oscillation (for example by creating a nonlinear dependence on overall FRQ phosphorylation) or whether it has a distinct physiological role that is not required for sustained oscillation.

      The reviewer raises the key question regarding data suggesting LLPS and phase separated regions in circadian systems. To date condensates have been seen in cyanobacteria (Cohen et al, 2014, Pattanayak et al, 2020) where there are foci containing KaiA/C during the night, in Drosophila (Xiao et al, 2021) where PER and dCLK colocalize in nuclear foci near the periphery during the repressive phase, and in Neurospora (Bartholomai et al, 2022) where the RNA binding protein PRD-2 sequesters frq and ck1a transcripts in perinuclear phase separated regions. Because the proteins responsible for the phase separation in cyanobacteria and Drosophila are not known, it is not possible to seamlessly disrupt the separation to test its biological significance (Yuan et al, 2022), so only in Neurospora has it been possible to associate loss of phase separation with clock effects. There, loss of PRD-2, or mutation of its RNA-binding domains, results in a ~3 hr period lengthening as well as loss of perinuclear localization of frq transcripts. A very recent manuscript (Xie et al., 2024) calls into question both the importance and very existence of LLPS of clock proteins at least as regards to mammalian cells, noting that it may be an artefact of overexpression in some places where it is seen, and that at normal levels of expression there is no evidence for elevated levels at the nuclear periphery. Artefacts resulting from overexpression plainly cannot be a problem for our study nor for Xiao et al. 2021 as in both cases the relevant clock protein, FRQ or PER, was labeled at the endogenous locus and expressed under its native promoter. Also, it may be worth noting that although we called attention to enrichment of FRQ[NeonGreen] at the nuclear periphery, there remained abundant FRQ within the core of the nucleus in our live-cell imaging.

      Cohen SE, et al.: Dynamic localization of the cyanobacterial circadian clock proteins. Curr Biol 2014, 24:1836–1844, https://doi.org/10.1016/j.cub.2014.07.036.

      Pattanayak GK, et al.: Daily cycles of reversible protein condensation in cyanobacteria. Cell Rep 2020, 32:108032, https://doi.org/10.1016/j.celrep.2020.108032.

      Xiao Y, Yuan Y, Jimenez M, Soni N, Yadlapalli S: Clock proteins regulate spatiotemporal organization of clock genes to control circadian rhythms. Proc Natl Acad Sci U S A 2021, 118, https://doi.org/10.1073/pnas.2019756118.

      Bartholomai BM, Gladfelter AS, Loros JJ, Dunlap JC. 2022 PRD-2 mediates clock-regulated perinuclear localization of clock gene RNAs within the circadian cycle of Neurospora. Proc Natl Acad Sci U S A. 119(31):e2203078119. doi: 10.1073/pnas.2203078119.

      Yuan et al., Curr Opin Cell Biol 78: 102129, 2022. https://doi.org/10.1016/j.ceb.2022.102129

      Pancheng Xie, Xiaowen Xie, Congrong Ye, Kevin M. Dean, Isara Laothamatas, S K Tahajjul T Taufique, Joseph Takahashi, Shin Yamazaki, Ying Xu, and Yi Liu (2024). Mammalian circadian clock proteins form dynamic interacting microbodies distinct from phase separation. Proc. Nat. Acad. Sci. USA. In press.

      We have updated the discussion on p. 15 accordingly:

      “Live cell imaging of fluorescently-tagged FRQ proteins is consistent with FRQ phase separation in N. crassa nuclei. FRQ is plainly not homogenously dispersed within nuclei, and the concentrated foci observed at specific positions in the nuclei indicate condensate behavior similar to that observed for other phase separating proteins (Bartholomai, et al., 2022; Caragliano, et al., 2022; Gonzalez, A., et al., 2021; Tatavosian, et al., 2019; Xiao, et al., 2021). While ongoing experiments are exploring more deeply the spatiotemporal dynamics of FRQ condensates in nuclei, the small size of fungal nuclei as well as their rapid movement with cytoplasmic bulk flow through the hyphal syncytium makes these experiments difficult. Of particular interest is drawing comparisons between FRQ and the Drosophila Period protein, which has been observed in similar foci that change in size and subnuclear localization throughout the circadian cycle (Meyer, et al., 2006; Xiao, et al., 2021), although it must be noted that the foci we observed are considerably more dynamic in size and shape than those reported for PER in Drosophila (Xiao, et al., 2021). A very recent manuscript (Xie, et al., 2024) calls into question the importance and very existence of LLPS of clock proteins at least in regards to mammalian cells, noting that it may be an artifact of overexpression in some instances where it is seen, and that at normal levels of expression there is no evidence for elevated levels at the nuclear periphery. Artifacts resulting from overexpression are unlikely to be a problem for our study and that of Xiao et al as in both cases clock proteins were tagged at their endogenous locus and expressed from their native promoters. Although we noted enrichment of FRQmNeonGreen near the nuclear envelope in our live-cell imaging, there remained abundant FRQ within the core of the nucleus.”

      Recommendations For The Authors:

      The data in Fig 6 showing microscopy of Neurospora is suggestive but needs more information/controls. Does the strain that expresses FRQ-mNeonGreen have normal circadian rhythms? How were the cultures handled (in terms of circadian entrainment etc.) for imaging? Do samples taken at different clock times appear different in terms of punctate structures in microscopy? The authors cite the Xiao 2021 paper in Drosophila, but would be good to see if the in vivo picture is fundamentally similar in Neurospora.

      All of the live-cell images we report were from cells grown in constant light; in the dark, strains bearing FRQ[NeonGreen] have normally robust rhythms with a slightly elongated period length as measured by a frq Cbox-luc reporter. Although we are interested, of course, in whether, and if so how, the punctate structures change as a function of circadian time, this is work in progress and beyond the scope of the present study. This said, it is plain to see from the movie included as a Supplemental file here that the puncta we see are moving and fusing/splitting on a scale of seconds whereas those reported in Drosophila by Xiao et al. (Xiao et al, 2021, above) were stable for many minutes; thus the FRQ foci seen in Neurospora are quite a bit more dynamic than those in Drosophila.

      We have updated the results section on p. 11 to provide this information more clearly: “FRQ thus tagged and driven by its own promoter is expressed at physiologically normal levels, and strains bearing FRQmNeonGreen as the only source of FRQ are robustly rhythmic with a slightly longer than normal period length. Live-cell imaging in Neurospora crassa offers atypical challenges because the mycelia grow as syncytia, with continuous rapid nuclei motion during the time of imaging. This constant movement of nuclei is compounded by the very low intranuclear abundance of FRQ and the small size of fungal nuclei, making not readily feasible visualization of intranuclear droplet fission/fusion cycles or intranuclear fluorescent photobleaching recovery experiments (FRAP) that could report on liquid-like properties. Nonetheless, bright and dynamic foci-like spots were observed well inside the nucleus and near the nuclear periphery, which is delineated by the cytoplasm-facing nucleoporin Son-1 tagged with mApple at its C-terminus (Fig. 6D,E, Movie S1). Such foci are characteristic of phase separated IDPs (Bartholomai, et al., 2022; Caragliano, et al., 2022; Gonzalez, A., et al., 2021; Tatavosian, et al., 2019) and share similar patterning to that seen for clock proteins in Drosophila (Meyer, et al., 2006; Xiao, et al., 2021), although the foci we observed are substantially more dynamic than those reported in Drosophila.”

      Another issue where some commentary would be helpful: Fig 7 shows that phase separation behavior is strongly temperature dependent (not biophysically surprising). Is that at odds with the known temperature compensation of the circadian rhythm if LLPS indeed plays a key role in the oscillator?

      We believe that the dependence of CK1-mediated FRQ phosphorylation on temperature, as manifested by FRQ phase separation, is consistent with temperature compensation within the Neurospora circadian oscillator. The phenomenon of temperature compensation by circadian clocks involves the intransigence of the oscillator period to temperature change. Stability of period with temperature change would not necessarily be expected of a generic chemical oscillator, which would run faster (shorter period) at higher temperature owing to Arrhenius behavior of the underlying chemical reactions. Circadian phosphorylation of FRQ is one such chemical process that contributes to the oscillation of FRQ abundance on which the clock is based. Reduced CK1 phosphorylation of FRQ causes both longer periods [Mehra et al., 2009] and loss of temperature compensation (manifested as a reduction of period length at higher temperature) [Liu et al, Nat Comm, 10, 4352 (2019); Hu et al, mBio, 12, e01425 (2021)]. Thus, the ability of increased LLPS formation at elevated temperature to reduce FRQ phosphorylation by CK1 (but not intrinsic CK1 autophosphorylation) would be a means to counter a decreasing period length that would otherwise manifest in an under compensated system. As further negative feedback on the system, LLPS is also promoted by FRQ phosphorylation itself, which in turn will reduce phosphorylation by CK1. Thus, both increased FRQ phosphorylation and temperature will couple to increased LLPS and mitigate period shortening through reduction of CK1 activity.
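The Arrhenius argument in this response can be made concrete with a small numerical sketch. It is illustrative only: the activation energy of 50 kJ/mol and the starting temperature of 293.15 K are hypothetical values, not taken from the manuscript, and the function names are invented here. The sketch computes the Q10 (fold change in rate for a 10 K rise) of a single Arrhenius-type reaction; an oscillator whose period scaled inversely with such a rate would shorten its period by the same factor, whereas a temperature-compensated clock keeps the Q10 of its period close to 1.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)

def arrhenius_rate(activation_energy, temperature, prefactor=1.0):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    return prefactor * math.exp(-activation_energy / (GAS_CONSTANT * temperature))

def q10(activation_energy, temperature):
    """Fold change in rate when temperature rises by 10 K from `temperature`."""
    return (arrhenius_rate(activation_energy, temperature + 10.0)
            / arrhenius_rate(activation_energy, temperature))

# Hypothetical reaction: Ea = 50 kJ/mol, warming from 293.15 K to 303.15 K.
rate_gain = q10(50_000.0, 293.15)

# If period were proportional to 1/rate, it would shrink by this factor.
uncompensated_period_ratio = 1.0 / rate_gain
```

With these assumed numbers the rate roughly doubles over 10 K, so an uncompensated period would roughly halve. A process that slows an effective reaction as temperature rises, as proposed here for LLPS-mediated reduction of FRQ phosphorylation by CK1, works against that trend.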

      Mehra et al. A Role for Casein Kinase 2 in the Mechanism Underlying Circadian Temperature Compensation. Cell 2009 137, 749-760.

      Liu et al. FRQ-CK1 interaction determines the period of circadian rhythms in Neurospora. Nat Commun. 2019 10, 4352.

      Hu et al. FRQ-CK1 Interaction Underlies Temperature Compensation of the Neurospora Circadian Clock. mBio 2021 12, e01425.

      We have added Figure 8 to clarify the interpretation of the temperature compensation implications of our work, the legend of which reads:

      “Figure 8: LLPS may play a role in temperature compensation of the clock through modulation of FRQ phosphorylation. Reduced CK1 phosphorylation of FRQ causes both longer periods (Mehra, et al., 2009) and loss of temperature compensation (manifested as a shortening of period at higher temperature) (Hu, et al., 2021; Liu, X., et al., 2019). Thus, the ability of increased LLPS at elevated temperature (larger grey circle) to reduce FRQ phosphorylation by CK1 will counter a shortening period that would otherwise manifest in an under compensated system. As further negative feedback, LLPS is also promoted by increased FRQ phosphorylation, which in turn will reduce phosphorylation by CK1. Thus, both increased FRQ phosphorylation and temperature favor LLPS and reduction of CK1 activity.”

      One minor comment: The chemical structures in Fig 3A have some issues where the "N" and "S" are flipped. It would be good to remake these figures to fix this problem.

      We apologize, the figure has been replaced with an improved version.

    2. eLife assessment

      This manuscript is a fundamental contribution to the understanding of the role of intrinsically disordered proteins in circadian clocks and the potential involvement of phase separation mechanisms. The authors convincingly report on the structural and biochemical aspects and the molecular interactions of the intrinsically disordered protein FRQ. The paper will be of interest to scientists focusing on circadian clock regulation, liquid-liquid phase separation, and phosphorylation.

    3. Reviewer #1 (Public Review):

      Summary:

      "Phosphorylation, disorder, and phase separation govern the behavior of Frequency in the fungal circadian clock" is a convincing manuscript that delves into the structural and biochemical aspects of FRQ and the FFC under both LLPS and non-LLPS conditions. Circadian clocks serve as adaptations to the daily rhythms of sunlight, providing a reliable internal representation of local time.

      All circadian clocks are composed of positive and negative components. The FFC contributes negative feedback to the Neurospora circadian oscillator. It consists of FRQ, CK1, and FRH. The FFC facilitates close interaction between CK1 and the WCC, with CK1-mediated phosphorylation disrupting WCC:c-box interactions necessary for restarting the circadian cycle.

      Despite the significance of FRQ and the FFC, challenges associated with purifying and stabilizing FRQ have hindered in vitro studies. Here, researchers successfully developed a protocol for purifying recombinant FRQ expressed in E. coli.

      Armed with full-length FRQ, they utilized spin-labeled FRQ, CK1, and FRH to gain structural insights into FRQ and the FFC using ESR. These studies revealed a somewhat ordered core and a disordered periphery in FRQ, consistent with prior investigations using limited proteolysis assays. Additionally, p-FRQ exhibited greater conformational flexibility than np-FRQ, and CK1 and FRH were found in close proximity within the FFC. The study further demonstrated that under LLPS conditions in vitro, FRQ undergoes phase separation, encapsulating FRH and CK1 within LLPS droplets, ultimately diminishing CK1 activity within the FFC. Intriguingly, higher temperatures enhanced LLPS formation, suggesting a potential role of LLPS in the fungal clock's temperature compensation mechanism.

      Biological significance was supported by live imaging of Neurospora, revealing FRQ foci at the periphery of nuclei consistent with LLPS. The amino acid sequence of FRQ conferred LLPS properties, and a comparison of clock repressor protein sequences in other eukaryotes indicated that LLPS formation might be a conserved process within the negative arms of these circadian clocks.

      In summary, this manuscript represents a valuable advancement with solid evidence in the understanding of a circadian clock system that has proven challenging to characterize structurally due to obstacles linked to FRQ purification and stability. The implications of LLPS formation in the negative arm of other eukaryotic clocks and its role in temperature compensation are highly intriguing.

    4. Reviewer #2 (Public Review):

      Summary:

      This study presents data from a broad range of methods (biochemical, EPR, SAXS, microscopy, etc.) on the large disordered protein FRQ relevant to circadian clocks and its interaction partners FRH and CK1, providing novel and fundamental insight into oligomerization state, local dynamics, and overall structure as a function of phosphorylation and association. Liquid-liquid phase separation is observed. These findings have bearings on the mechanistic understanding of circadian clocks, and on functional aspects of disordered proteins in general.

      Strengths:

      This is a thorough work that is well presented. The data are of overall high quality given the difficulty of working with an intrinsically disordered protein, and the conclusions are sufficiently circumspect and qualitative to not overinterpret the mostly low-resolution data.

      Weaknesses:

      None

    5. Reviewer #3 (Public Review):

      Summary:

      The manuscript from Tariq and Maurici et al. presents important biochemical and biophysical data linking protein phosphorylation to phase separation behavior in the repressive arm of the Neurospora circadian clock. This is an important topic that contributes to what is likely a conceptual shift in the field.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      The single-mutant and double-mutant crp/rpoB strains were made by co-transduction with a nearby gene deletion (kanR-marked). I couldn't tell from the methods section whether these mutants, e.g., crp-H22N delta-chiA, were compared to wild-type cells or deletion mutants, e.g., delta chiA, in the proteomics experiments. I encourage the authors to explain this more clearly in the methods section, and to briefly mention in the Results section and relevant figure legends that the crp/rpoB mutant strains (and possibly the "wild-type" strains) also have gene deletions. If the comparison "wild-type" strains are fully wild-type (i.e., not deleted for chiA/yjaH), it is especially important to mention this in the Results section and the figure legends, since the phenotypic changes could be due to the gene deletions rather than the mutations in crp/rpoB.

      We appreciate and agree with the editor's suggestion to clarify this point.

      Accordingly, we have made the following changes to the text:

      p11 L30-34 in the main text:

      "The second experiment similarly compared an engineered BW25113 (BW) strain, containing the two regulatory mutations from the compact set (i.e., crp H22N and rpoB A1245V) together with the deletions used to insert them (see methods and DataS1 file), to a “wild type” BW strain (a corresponding knockout strain without the mutations, see methods)."

      p28 under Chemostat proteomics experiment L13-16 in methods:

      "The starting volume of each bioreactor was 150 ml M9 media supplemented with either 30 mM and 10mM D-xylose for the evolved and ancestor samples or only 10mM D-xylose for BW including compact set mutations and/or the deletions used for their insertions (DataS1 file). The minimal media also included trace elements and vitamin B1 was omitted."

    2. Joint Public Review:

      The authors previously showed that expressing formate dehydrogenase, rubisco, carbonic anhydrase, and phosphoribulokinase in Escherichia coli, followed by experimental evolution, led to the generation of strains that can metabolise CO2. Using two rounds of experimental evolution, the authors identify mutations in three genes - pgi, rpoB, and crp - that allow cells to metabolise CO2 in their engineered strain background. The authors make a strong case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle. The authors also use proteomic analysis to probe the role of the mutations in crp and rpoB. While they do not reach strong conclusions about how these mutations promote autotrophic growth, they provide some clues, leading to valuable speculation.

      Comments on revised version:

      The authors have thoroughly addressed the reviewers' comments. The major addition to the paper is the proteomic analysis of single and double mutants of crp and rpoB. These new data provide clues as to the role of the crp and rpoB mutations in promoting autotrophic growth, which the authors discuss. The authors acknowledge that it will require additional experiments to determine whether the speculated mechanisms are correct. Nonetheless, the new data provide valuable new insight into the role of the crp and rpoB mutations. The authors have also expanded their description of the crp and rpoB mutations, making it clearer that the effects of these mutations are likely to be distinct, albeit with potential for overlap in function.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Su et al propose the existence of two mechanisms repressing SBF activity during entry into meiosis in budding yeast. First, a decrease in Swi4 protein levels by a LUTI-dependent mechanism where Ime1 would act closing a negative feedback loop. Second, the sustained presence of Whi5 would contribute to maintaining SBF inhibited under sporulation conditions. The article is clearly written and the experimental approaches used are adequate to the aims of this work. The results obtained are in line with the conclusions reached by the authors but, in my view, they could also be explained by the existing literature and, hence, would not represent a major advance in the field of meiosis regulation.

      We respectfully disagree with the reviewer's comment that this work can be explained by the existing literature. First, while SWI4LUTI has been previously identified in meiotic cells along with ~ 380 LUTIs, the biological purpose of these alternative mRNA isoforms and their effect on cellular physiology still remain largely unknown. Our manuscript addresses this gap in understanding for SWI4LUTI. Loss of SWI4LUTI contributes to dysregulation of meiotic entry and does so by failing to properly repress the known inhibitors of meiotic entry, the CLNs. Furthermore, even though Cln1 and Cln2 have been previously shown to antagonize meiosis, the mechanisms that restrict their activity were unclear prior to our study.

      We recognize work done by others demonstrating Whi5-dependent repression of SBF during mitotic G1/S transition (De Bruin et al., 2004; Costanzo et al., 2004). We further examined Whi5’s involvement during meiotic entry and found that it acts in conjunction with the LUTI-based mechanism to restrict SBF activity. Combined loss of both mechanisms results in the increased expression of G1 cyclins, decreased expression of early meiotic genes, and a delay in meiotic entry (Figure 6). Neither mechanism was previously known to regulate meiotic entry. Our study not only adds to our broader understanding of gene regulation during meiosis but also raises additional questions regarding how LUTIs regulate gene expression and function.

      Regarding the first mechanism, Fig 1 shows that Swi4 decreases very little after 1-2h in sporulation medium, whereas G1-cyclin expression is strongly repressed very rapidly under these conditions (panel D and work by others). This fact dampens the functional relevance of Swi4 downregulation as a causal agent of G1 cyclin repression.

      Reviewer 1 expresses concern for the observation that by 2 h in sporulation media there is a 32% decrease in Swi4-3V5 protein abundance compared to 0 h in SPO. This is consistent with the range of protein level decrease typically accomplished by LUTI-based gene regulation (Chen et al., 2017; Chia et al., 2017; Tresenrider et al., 2021), and while it is a modest reduction, it is consistent across replicates. Furthermore, we don’t make the argument that reduction in Swi4 levels alone is the sole regulator of G1 cyclin levels. In fact, we report that in addition to Swi4 downregulation, Whi5 also functions to restrict SBF activity during meiotic entry, thereby ensuring G1 cyclin repression.

      In addition, the LUTI-deficient SWI4 mutant does not cause any noticeable relief in CLN2 repression, arguing against the relevance of this mechanism in the repression of G1-cyclin transcription during entry into meiosis. The authors propose a second mechanism where Whi5 would maintain SBF inactive under sporulation conditions. The role of Whi5 as a negative regulator of the SBF regulon is well known. On the other hand, the double WHI5-AA SWI4-dLUTI mutant does not upregulate CLN2, the G1 cyclin with the strongest negative effect on sporulation, raising serious doubts on the functional relevance of this backup mechanism during entry into meiosis.

      Due to replicate variance, CLN2 did not make the cut by our mRNA-seq data analysis as a significant hit. To address reviewer 1’s final point we opted for the “gold standard” of reverse transcription coupled with qPCR to measure CLN2 transcript levels in the double mutant ∆LUTI; WHI5-AA and the wild-type control. This revealed that CLN2 levels were significantly increased in the double mutant compared to wild type at 2 h in SPO (Author Response Image 1, *, p = 0.0288, two-tailed t-test).

      Author response image 1.

      Wild type (UB22199) and ∆LUTI;WHI5-AA (UB25428) cells were collected to perform RT-qPCR for CLN2 transcript abundance. Transcript abundance was quantified using primer sets specific for each respective gene from three technical replicates for each biological replicate. Quantification was performed in reference to PFY1 and then normalized to wild-type control. FC = fold change. Experiments were performed twice using biological replicates, mean value plotted with range. Differences in wild type versus ∆LUTI;WHI5-AA transcript levels were compared with a two-tailed t-test (*, p = 0.0288).
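
The quantification described in this legend (target gene referenced to PFY1, then normalized to the wild-type control) can be sketched as follows. The sketch assumes the standard 2^-ΔΔCt calculation with 100% amplification efficiency. The legend does not state the exact formula used, and the Ct values in the usage example are made up for illustration.

```python
from statistics import mean


def fold_change(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative transcript abundance by the 2^-ddCt method.

    Each argument is a list of technical-replicate Ct values:
    target and reference gene in the sample, then in the control strain.
    """
    # dCt: target gene relative to the reference gene within each strain
    d_ct_sample = mean(ct_target) - mean(ct_reference)
    d_ct_control = mean(ct_target_ctrl) - mean(ct_reference_ctrl)
    # ddCt: sample relative to control; one cycle corresponds to a 2-fold change
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: three technical replicates per measurement
    mutant_target = [24.0, 24.1, 23.9]
    mutant_reference = [20.0, 20.0, 20.0]
    wild_type_target = [25.0, 25.1, 24.9]
    wild_type_reference = [20.0, 20.0, 20.0]
    fc = fold_change(mutant_target, mutant_reference,
                     wild_type_target, wild_type_reference)
    print(f"fold change relative to wild type = {fc:.2f}")
```

The t-test on the resulting fold changes across biological replicates is not shown here.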

      Reviewer #2 (Public Review):

      Summary:

      The manuscript highlights a mechanistic insight into meiotic initiation in budding yeast. In this study, the authors addressed a genetic link between the mitotic cell cycle regulator SBF (the Swi4-Swi6 complex) and the meiosis-inducing regulator Ime1 in the context of meiotic initiation. The authors' comprehensive analyses with cytology, imaging, and RNA-seq using mutant strains lead the authors to conclude that Swi4 levels regulate Ime1-Ume6 interaction to activate expression of early meiosis genes for meiotic initiation. The major findings in this paper are that (1) the higher level of Swi4, a subunit of SBF transcription factor for mitotic cell cycle regulation, is the limiting factor for mitosis-to-meiosis transition; (2) G1 cyclins (Cln1, Cln2), that are expressed under SBF, inhibit Ime1-Ume6 interaction under overexpression of SWI4, which consequently leads to downregulation of early meiosis genes; (3) expression of SWI4 is regulated by LUTI-based transcription in the SWI4 locus that impedes expression of canonical SWI4 transcripts; (4) expression of SWI4 LUTI is likely negatively regulated by Ime1; (5) Action of Swi4 is negatively regulated by Whi5 (homologous to Rb)-mediated inhibition of SBF, which is required for meiotic initiation. Thus, the authors proposed that meiotic initiation is regulated under the balance of mitotic cell cycle regulator SBF and meiosis-specific transcription factor Ime1.

      Strengths:

      The most significant implication in their paper is that meiotic initiation is regulated under the balance of mitotic cell cycle regulator and meiosis-specific transcription factor. This finding will provide a mechanistic insight into the initiation of meiosis not only in budding yeast but also in mammals. The manuscript is overall well written, logically presented and raises several insights into meiotic initiation in budding yeast. Therefore, the manuscript should be open for the field. I would like to raise the following concerns, though they are not mandatory to address. However, it would strengthen their claims if the authors could technically address them and revise the manuscript with a more comprehensive discussion.

      Weaknesses:

      The authors showed increased expression of the SBF targets, and a reciprocal decrease in expression of meiotic genes, upon SWI4 overexpression at 2 h in SPO (Figure 2F). However, IME1 was not found as a DEG in Supplemental Table 1. Meanwhile, IME1 transcript level was decreased at 2 h SPO condition in pATG8-CLN2 cells in Fig S4C.

      This reviewer is still unsure whether expression of IME1 transcripts per se is directly or indirectly suppressed under the SBF-activated gene expression program at 2 h SPO in pATG8-SWI4 and pATG8-CLN2 cells. This reviewer wonders how the Fig S4C data reconcile with the model summarized in Fig 6F.

      One interpretation could be that persistent overexpression of G1 cyclin caused active mitotic cell cycle, and consequently delayed exit from mitotic cell cycle, which may have given rise to an apparent reduction of cell population that was expressing IME1. For readers to better understand, it would be better to explain comprehensively this issue in the main text.

      We believe there was an oversight here. In supplemental table 1, IME1 expression is reported as significantly decreased. The volcano plot shown below also highlights this change (Author response image 2).

      Author response image 2.

      Volcano plot of DE-Seq2 analysis for ∆LUTI;WHI5-AA versus wild type. Dashed line indicates padj (p value) = 0.05. Analysis was performed using mRNA-seq from two biological replicates. Wild type (UB22199) and ∆LUTI;WHI5-AA (UB25428) cells were collected at 2 h in SPO. SBF targets (pink) (Iyer et al., 2001) and early meiotic genes (blue) defined by (Brar et al., 2012). Darker pink or darker blue, labeled dots are well studied targets in either gene set list.

      The % of cells with nuclear Ime1 was much reduced in pATG8-CLN2 cells (Fig 2B) than in pATG8-SWI4 cells (Fig 4C). Is the Ime1 protein level comparable or different between pATG8-CLN2 strain and pATG8-SWI4 strain? Since it is difficult to compare the quantifications of Ime1 levels in Fig S1D and Fig S4B, it would be better to comparably show the Ime1 protein levels in pATG8-CLN2 and pATG8-SWI4 strains.

      Further, it is uncertain how pATG8-CLN2 cells mimics the phenotype of pATG8-SWI4 cells in terms of meiotic entry. It would be nice if the authors could show RNA-seq of pATG8-CLN2/WT and/or quantification of the % of cells that enter meiosis in pATG8-CLN2.

      Analyzing bulk Ime1 protein levels across a population of cells (Author response image 3) reveals that overexpression of CLN2 causes a more severe decrease in Ime1 levels than overexpression of SWI4. This is consistent with our observation that pATG8-CLN2 has a more severe impact on meiotic entry than pATG8-SWI4. The higher CLN2 levels (Author response image 4) likely account for the observed difference in severity of phenotype between the two mutants.

      Author response image 3.

      Samples from wild type (UB22199), pATG8-SWI4 (UB2226), and pATG8-CLN2 (UB25959) strains were collected between 0-4 hours (h) in sporulation medium (SPO) and immunoblots were performed using α-GFP. Hxk2 was used as a loading control.

      Author response image 4.

      Wild type (UB22199), pATG8-SWI4 (UB2226), pATG8-CLN2 (UB25959) cells were collected to perform RT-qPCR for CLN2 transcript abundance. Quantification was performed in reference to PFY1 and then normalized to wild-type control. FC = fold change.

      The authors stated that reduced Ime1-Ume6 interaction is a primary cause of meiotic entry defect by CLN2 overexpression (Line 320-322, Fig 4J-L). This data is convincing. However, the authors also showed that GFP-Ime1 protein level was decreased compared to WT in pATG8-CLN2 cells by WB (Fig S4A).

      Compared to wild type, pATG8-CLN2 cells have lower levels of Ime1. Consequently, reviewer 2 suggests that this reduction may be responsible for the observed meiotic defect. However, we tested this possibility and found it not to be the primary cause of the meiotic defect in pATG8-CLN2 cells. As shown in Figure S4A, when IME1 was overexpressed from the pCUP1 promoter, Ime1 protein levels were similar between wild-type and pATG8-CLN2 cells. Despite this similarity, we still observed a decrease in nuclear Ime1 (Figure 4F) and no rescue in sporulation (Figure 4A). Therefore, the reduction in Ime1 protein levels alone cannot explain the meiotic defect caused by CLN2 overexpression.

      Further, GFP-Ime1 signals were overall undetectable through nuclei and cytosol in pATG8-CLN2 cells (Fig 4B), and accordingly cells with nuclear Ime1 were reduced (Fig 4C). Although the authors raised a possibility that the meiotic entry defect in the pATG8-CLN2 mutant arises from downregulation of IME1 expression (Line 282-283), causal relationship between meiotic entry defect and CLN2 overexpression is still not clear.

      As reviewer 2 comments, we initially considered the possibility that meiotic entry defect induced by CLN2 overexpression could be attributed to decreased IME1 expression. However, in the following paragraph in the manuscript, we demonstrate equalizing IME1 transcript levels using the pCUP1-IME1 allele does not rescue the meiotic defect caused by CLN2 overexpression. Consequently, we conclude that the decrease in IME1 transcript levels alone cannot explain the meiotic defect caused by increased CLN2 levels.

      Is the Ime1 protein level reduced in the pATG8-CLN2;UME6-⍺GFP strain compared to WT? It would be better to comparably show the Ime1 protein levels in the pATG8-CLN2 strain and the pATG8-CLN2;UME6-⍺GFP strain by WB. Also, it would be nice if the authors could show quantification of the % of cells that enter meiosis in the pATG8-CLN2;UME6-⍺GFP strain to see how and whether artificial tethering of Ime1 to Ume6 rescued normal meiosis program rather than simply showing % sporulation in Fig4A.

      We do not agree with the suggestion to compare the pATG8-CLN2;UME6-⍺GFP with wild type as the kinetics of meiosis is rather different. The more appropriate comparison is UME6-⍺GFP and pATG8-CLN2;UME6-⍺GFP which shows GFP-Ime1 bulk protein levels are slightly lower (Author response image 5). However, when we use a more sensitive measurement of meiotic entry through the nuclear accumulation of Ime1 in single cells, as illustrated in Figure 4L, it becomes evident that the Ume6-Ime1 tether is capable of restoring nuclear Ime1 levels, even in the presence of CLN2 overexpression. Given that these cells exhibited wild type levels of nuclear Ime1 and underwent sporulation after 24 hours, we make the fair assumption that they have successfully initiated the meiotic program.

      Author response image 5.

      Wild type (UB22199), pATG8-SWI4 (UB35106), UME6-⍺GFP (UB35300), and UME6-⍺GFP; pATG8-CLN2 (UB35177) cells were collected between 0-3 hours (h) in sporulation medium (SPO) and immunoblots were performed using α-GFP. Hxk2 was used as a loading control.

      The authors showed Ume6 binding at the SWI4LUTI promoter (Figure 5K). However, since Ume6 forms a repressive form with Rpd3 and Sin3a and binds to target genes independently of Ime1, Ume6 binding at the SWI4LUTI promoter does not necessarily represent Ime1-Ume6 binding there. Instead, it would be better to show Ime1 ChIP-seq at the SWI4LUTI promoter.

      We agree with reviewer 2 that Ime1 ChIP would be the ideal measurement. Unfortunately, this has proved to be technically challenging. To address this limitation, we utilized a published Ume6 ChIP-seq dataset along with a published UME6-T99N RNA-seq dataset. Cells carrying the UME6-T99N allele are unable to induce the expression of early meiotic transcripts due to lack of Ime1 binding to Ume6 (Bowdish et al., 1995). Accordingly, RNA-seq analysis should reveal whether or not the LUTIs identified by Ume6 ChIP are indeed regulated by Ime1-Ume6 during meiosis. For SWI4LUTI, this is exactly what we observe. Not only is there Ume6 binding at the SWI4LUTI promoter (Figure 5K), but there is also a significant decrease in SWI4LUTI expression in UME6-T99N cells under meiotic conditions (Figure S5). Based on these data, we conclude that the Ime1-Ume6 complex is responsible for regulating SWI4LUTI expression during meiosis.

      The authors showed that the ∆LUTI mutant and the WHI5-AA mutant did not significantly change the expression of SBF targets or early meiotic genes relative to wild type (Figure 6A, C). Accordingly, they concluded that LUTI- or Whi5-based repression of SBF alone was not sufficient to cause a delay in meiotic entry (Line 451-452), and perturbation of both pathways led to a significant delay in meiotic entry (Figure 6E). This reviewer wonders whether Ime1 expression level and nuclear localization of Ime1 were normal in the ∆LUTI mutant and the WHI5-AA mutant.

      Based on our observations in Figure 4, Ime1 protein and expression levels were not reliable indicators of meiotic entry. Consequently, we opted for a more downstream and functionally relevant measure of meiotic entry, which involved time-lapse fluorescence imaging of Rec8, an Ime1 target.

      Reviewer #1 (Recommendations For The Authors):

      The authors would like to mention previous work showing that G1-cyclin overexpression decreases the expression and nuclear accumulation of Ime1 (Colomina et al 1999 EMBO J 18:320). In this work, the interaction between Ime1 and Ume6 had been found to be resistant to G1-cyclin expression, arguing against a direct effect on the recruitment of Ime1 at meiotic promoters. Alternatively, differences in the experimental approaches used could be discussed to explain this apparent discrepancy.

      To clarify, in the paper that reviewer 1 is referring to (Colomina et al., 1999), the authors determine that the interaction between Ime1 and Ume6 is regulated by the presence of a non-fermentable carbon source. Additional work by others reveals that Ime1 undergoes phosphorylation by the protein kinases Rim11 and Rim15, promoting its nuclear localization and enabling interaction with Ume6 (Vidan and Mitchell, 1997; Pnueli et al., 2004; Malathi et al., 1999, 1997). Furthermore, both Rim11 and Rim15 kinase activities are inhibited by the presence of glucose via the PKA pathway (Pedruzzi et al., 2003; Rubin-Bejerano et al., 2004; Vidan and Mitchell, 1997). Accordingly, the elimination of cyclins in the presence of a fermentable carbon source (glucose) in (Colomina et al., 1999) is unlikely to result in an interaction between Ime1 and Ume6, as Rim11 and Rim15 remain repressed. Removal of cyclins in acetate does not further increase Ime1-Ume6 interaction, leading the authors to conclude that G1 cyclins do not block Ime1 function through its interaction with Ume6. This work, however, uses loss of function (removal of G1 cyclins) to study the G1 cyclins’ effect on Ime1-Ume6 interaction while using timepoints that are well beyond meiotic entry. Additionally, Ime1-Ume6 interaction is being tested using yeast two-hybrid analysis with just the proposed interaction domain of Ime1 (amino acids 270-360). Therefore, the interpretation that G1 cyclins are dispensable for regulating the interaction between Ime1 and Ume6 is unclear from this work alone.

      There are many differences that can explain the discrepancy between our work and (Colomina et al., 1999). Our work uses increased expression of cyclins during meiotic entry. Additionally, in our study, we collected timepoints to measure meiotic entry (2 h in SPO) and sporulation (gamete formation) efficiency (24 h in SPO). Finally, we are using the endogenous, full length Ime1. These differences could very well explain the discrepancy with previous work. Lastly, in our discussion we acknowledge the lack of CDK consensus phosphorylation sites on Ime1. Therefore, it is most likely that G1 cyclins are not directly phosphorylating Ime1 and that other factors like Rim11 and Rim15 could be direct targets of the G1 cyclins, considering their involvement in the phosphorylation of Ime1-Ume6, as well as their role in regulating Ime1 localization and its interaction with Ume6. We have included these points in the revised manuscript (lines 547-551).

      Reviewer #2 (Recommendations For The Authors):

      This reviewer thinks that the findings in this paper are of general interest to the meiosis field and help in understanding the mechanism of meiotic initiation in mammals. The current manuscript seems to be written for a limited audience of budding yeast scientists, and its appeal should not be limited to them. Thus, it would be better to discuss more about what is known about the mechanism of initiation of meiosis not only in budding yeast but also in other species, to share their findings with a broader range of scientists working on other organisms.

      We appreciate reviewer 2’s comment and have added more discussion about the parallels between yeast and mammalian systems in meiotic initiation (lines 613-624).

      Reviewer #3 (Recommendations For The Authors):

      The effect of overexpression of Swi4 is tested for MI and MII (Fig1F): this is a very indirect readout of meiotic entry. The authors could present Rec8 localization (Fig2I) at this stage. However, this is still a superficial description of the meiotic phenotype: is the phenotype only a delay or is the meiotic prophase altered? It is especially important to analyse this in more detail to answer whether the overexpression of Swi4 leads to an identical phenotype to the one of CLN2. Also, the comparison between overexpression of Swi4 and Cln2 is difficult to evaluate: what is the level of CLN2 when Swi4 is overexpressed compared to CLN2 overexpression? The percentage of nuclear Ime1 is 50% vs 5% when Swi4 or Cln2 are overexpressed. What is the interpretation? What are the levels of Ime1? (Y axis of quantifications not comparable, see also comment for Fig5F,H)

      CLN2 is expressed at a much higher level in pATG8-CLN2 cells relative to pATG8-SWI4 (Author Response Image 4). Therefore, we don’t expect identical phenotypes, but rather a more severe deficiency in meiotic entry upon CLN2 overexpression. The key experiment that establishes causality between SWI4 and CLNs is reported in Figure 3, where deletion of either CLN1 or CLN2 rescues the meiotic entry delay exerted by SWI4 overexpression.

      Fig3EF: What is the phenotype of Cln1 and Cln2 without overexpression of Swi4?

      Meiotic entry is not faster in cln1∆ or cln2∆ cells compared to wild-type. We included these data in Supplemental Figure 3 and made the relevant changes in the manuscript (lines 257-261).

      Fig4F: Need a control with CLN2 overexpression only.

      A control with only CLN2 overexpression (pATG8-CLN2) is not appropriate since these meiotic time course experiments are synchronized using the pCUP1-IME1 allele. It would be a misleading comparison since the two meioses would have different kinetics. Figure 4F reports that despite similar IME1 transcript levels and Ime1 protein levels, CLN2 overexpressing cells still have reduced nuclear Ime1. Since side-by-side comparison of pATG8-CLN2 and pCUP1-IME1 is not possible, we chose to measure sporulation efficiency at 24 h in Figure 4A. These data together suggest that elevated IME1 transcript and protein levels cannot rescue the defects associated with increased CLN2 expression.

      Fig5E: in wild type, by Northern blot, Swi4canon level is increasing during meiosis, not decreasing?, whereas protein level is decreasing, what is the interpretation?

      Northern blot data are less quantitative than smFISH, which shows that SWI4canon transcript levels are significantly lower in meiosis compared to vegetative cells (Figure 5D). We also note that the Northern blot data were acquired from unsynchronized meiotic cells and could have additional limitations based on the population-based nature of the assay. Finally, additional analysis of a transcript leader sequencing (TL-seq) dataset from synchronized cells (Tresenrider et al., 2021) further confirms the decrease in SWI4canon transcript levels upon meiotic entry (Author response image 6).

      Author response image 6.

      TL-seq data from (Tresenrider et al. 2021) visualized on IGV at the SWI4 locus. Two timepoints are plotted including premeiotic before IME1 induction (pink) and meiotic prophase or after IME1 induction (blue).

      Fig5F, H. This quantification needs duplicates for validation.

      Replicates for every blot in this paper have been submitted to eLife. They can be found in the Dropbox folder shared with the editors (named Raw-blots-for-eLIFE).

      Fig5F, H. Why are the wild type values so different?

      The immunoblots in Figure 5F and Figure 5H were run as separate blots and therefore should not be compared. Additionally, these values are not absolute measurements of wild-type Swi4-3V5 levels, so we should not expect them to be the same. All comparisons of relative Swi4-3V5 amounts are made on the same blot and normalized to a loading control, hexokinase.

      FigS5: What is the effect of the Ume6-T99N on Swi4 protein level and on meiotic entry? Is the backup mechanism proposed active?

      We haven’t measured Swi4 protein levels in the UME6-T99N background but given that this mutation is known to disrupt the interaction between Ime1 and Ume6, we expect a similar trend to that reported in Figure 5I (pCUP1-IME1 uninduced).

      What is the evidence that Swi4/6 is a E2F homolog? What is the homology at the protein level?

      While there is no sequence homology between SBF and E2F, there is remarkable similarity between metazoans and yeast in the regulation of the G1/S transition (reviewed in Bertoli et al., 2013). E2F and SBF are both repressed before the G1/S transition by the inhibitors Rb and Whi5, respectively (Costanzo et al., 2004; De Bruin et al., 2004; Hasan et al., 2014). During the G1/S transition, a cyclin-dependent kinase phosphorylates and inactivates these inhibitors. We have carefully edited our language in the manuscript to “functional homology” instead of just “homology”.

      FigS3 is missing

      Each supplemental figure was matched to its corresponding main figure. In the original submission, we didn’t have Figure S3. However, the revised manuscript now contains FigS3.

      Bertoli, C., J.M. Skotheim, and R.A.M. De Bruin. 2013. Control of cell cycle transcription during G1 and S phases. Nat. Rev. Mol. Cell Biol. 14:518–528. doi:10.1038/nrm3629.

      Bowdish, K.S., H.E. Yuan, and A.P. Mitchell. 1995. Positive control of yeast meiotic genes by the negative regulator UME6. Mol. Cell. Biol. 15:2955–2961. doi:10.1128/mcb.15.6.2955.

      Brar, G.A., M. Yassour, N. Friedman, A. Regev, N.T. Ingolia, and J.S. Weissman. 2012. High-Resolution View of the Yeast Meiotic Program Revealed by Ribosome Profiling. Science. 335:552–558. doi:10.1126/science.1215110.

      De Bruin, R.A.M., W.H. McDonald, T.I. Kalashnikova, J. Yates, and C. Wittenberg. 2004. Cln3 activates G1-specific transcription via phosphorylation of the SBF bound repressor Whi5. Cell. 117:887–898. doi:10.1016/j.cell.2004.05.025.

      Chen, J., A. Tresenrider, M. Chia, D.T. McSwiggen, G. Spedale, V. Jorgensen, H. Liao, F.J. Van Werven, and E. Ünal. 2017. Kinetochore inactivation by expression of a repressive mRNA. Elife. 6:1–31. doi:10.7554/eLife.27417.

      Chia, M., A. Tresenrider, J. Chen, G. Spedale, V. Jorgensen, E. Ünal, and F.J. van Werven. 2017. Transcription of a 5’ extended mRNA isoform directs dynamic chromatin changes and interference of a downstream promoter. Elife. 6:1–23. doi:10.7554/eLife.27420.

      Colomina, N., E. Garí, C. Gallego, E. Herrero, and M. Aldea. 1999. G1 cyclins block the Ime1 pathway to make mitosis and meiosis incompatible in budding yeast. EMBO J. 18:320–329. doi:10.1093/emboj/18.2.320.

      Costanzo, M., J.L. Nishikawa, X. Tang, J.S. Millman, O. Schub, K. Breitkreuz, D. Dewar, I. Rupes, B. Andrews, and M. Tyers. 2004. CDK activity antagonizes Whi5, an inhibitor of G1/S transcription in yeast. Cell. 117:899–913. doi:10.1016/j.cell.2004.05.024.

      Hasan, M., S. Brocca, E. Sacco, M. Spinelli, P. Elena, L. Matteo, A. Lilia, and M. Vanoni. 2014. A comparative study of Whi5 and retinoblastoma proteins: from sequence and structure analysis to intracellular networks. Front. Physiol. 4:1–24. doi:10.3389/fphys.2013.00315.

      Iyer, V.R., C.E. Horak, P.O. Brown, D. Botstein, V.R. Iyer, M. Snyder, and C.S. Scafe. 2001. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature. 409:533–538. doi:10.1038/35054095.

      Malathi, K., Y. Xiao, and A.P. Mitchell. 1997. Interaction of yeast repressor-activator protein Ume6p with glycogen synthase kinase 3 homolog Rim11p. Mol. Cell. Biol. 17:7230–7236. doi:10.1128/mcb.17.12.7230.

      Malathi, K., Y. Xiao, and A.P. Mitchell. 1999. Catalytic roles of yeast GSK3β/shaggy homolog Rim11p in meiotic activation. Genetics. 153:1145–1152. doi:10.1093/genetics/153.3.1145.

      Pedruzzi, I., F. Dubouloz, E. Cameroni, V. Wanke, J. Roosen, J. Winderickx, and C. De Virgilio. 2003. TOR and PKA Signaling Pathways Converge on the Protein Kinase Rim15 to Control Entry into G0. Mol. Cell. 12:1607–1613. doi:10.1016/S1097-2765(03)00485-4.

      Pnueli, L., I. Edry, M. Cohen, and Y. Kassir. 2004. Glucose and Nitrogen Regulate the Switch from Histone Deacetylation to Acetylation for Expression of Early Meiosis-Specific Genes in Budding Yeast. Mol. Cell. Biol. 24:5197–5208. doi:10.1128/mcb.24.12.5197-5208.2004.

      Rubin-Bejerano, I., S. Sagee, O. Friedman, L. Pnueli, and Y. Kassir. 2004. The In Vivo Activity of Ime1, the Key Transcriptional Activator of Meiosis-Specific Genes in Saccharomyces cerevisiae, Is Inhibited by the Cyclic AMP/Protein Kinase A Signal Pathway through the Glycogen Synthase Kinase 3-β Homolog Rim11. Mol. Cell. Biol. 24:6967–6979. doi:10.1128/mcb.24.16.6967-6979.2004.

      Tresenrider, A., K. Morse, V. Jorgensen, M. Chia, H. Liao, F.J. van Werven, and E. Ünal. 2021. Integrated genomic analysis reveals key features of long undecoded transcript isoform-based gene repression. Mol. Cell. 81:2231-2245.e11. doi:10.1016/j.molcel.2021.03.013.

      Vidan, S., and A.P. Mitchell. 1997. Stimulation of yeast meiotic gene expression by the glucose-repressible protein kinase Rim15p. Mol. Cell. Biol. 17:2688–2697. doi:10.1128/mcb.17.5.2688.

    2. eLife assessment

      This study highlights several important regulatory pathways that contribute to the control of entry into meiosis by turning down mitotic functions. Central to this regulation is the control of Swi4 level and activity, and convincing overexpression experiments identify downstream effectors of Swi4.

    3. Joint Public Review:

      The manuscript provides mechanistic insight into meiotic initiation in budding yeast. In this study, the authors analyzed the genetic link between the mitotic cell cycle regulator SBF (the Swi4-Swi6 complex) and the meiosis-inducing regulator Ime1 in the context of meiotic initiation. The authors' comprehensive analyses, combining cytology, imaging, and RNA-seq in mutant strains, lead to the conclusion that Swi4 levels regulate the Ime1-Ume6 interaction to activate expression of early meiotic genes for meiotic initiation.

      The authors first show a downregulation of Swi4 at the protein level upon meiotic entry and then investigate the downstream consequences. This study reveals several layers of regulation: 1) Mutations in CLN1 and CLN2, which are targets of Swi4, rescue the delay in meiotic entry observed when Swi4 is overexpressed; 2) Ime1 activity, and more specifically its interaction with Ume6, is antagonized by Swi4; 3) Expression of SWI4 is regulated by LUTI-based transcription at the SWI4 locus, which impedes expression of canonical SWI4 transcripts; 4) The expression of SWI4 LUTI is likely negatively regulated by the Ime1-Ume6 complex; 5) Whi5 restricts SBF activity during meiotic entry, thereby ensuring cyclin repression.

      The important implication of this paper is that meiotic initiation is regulated by the balance between a mitotic cell cycle regulator and a meiosis-specific transcription factor.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Sender et al describe a model to estimate what fraction of DNA becomes cell-free DNA in plasma. This is of great interest to the community, as the amount of DNA from a certain tissue (for example, a tumor) that becomes available for detection in the blood has important implications for disease detection.

      However, the authors' methods do not consider important variables related to cell-free DNA shedding and storage, and their results may thus be inaccurate. At this stage of the paper, the methods section lacks important detail. Thus, it is difficult to fully assess the manuscript and its results.

      Strengths:

      The question asked by the authors has potentially important implications for disease diagnosis. Understanding how genomic DNA degrades in the human circulation can guide towards ways to enrich for DNA of interest or may lead to unexpected methods of conserving cell-free DNA. Thus, the question "how much genomic DNA becomes cfDNA" is of great interest to the scientific and medical community. Once the weaknesses of the manuscript are addressed, I believe this manuscript has the potential to be a widely used resource.

      Weaknesses:

      There are two major weaknesses in how the analysis is presented. First, the methods lack detail. Second, the model does not consider key variables.

      Issues pertaining to the methods section.

      The current manuscript builds a flux model, mostly taking values and results from three previous studies: 1) The amount of cellular turnover by cell type, taken from Sender & Milo, 2021

      2) The fractions of various tissues that contribute DNA to the plasma, taken from Moss et al, 2018 and Loyfer et al, 2023

      My expertise lies in cell-free DNA, and so I will limit my comments to the manuscripts in (2). Paper by Loyfer et al (additional context):

      Loyfer et al is a recent landmark paper that presents a computational method for deconvoluting tissues of origin based on methylation profiles of flow-sorted cell types. Thus, the manuscript provides a well-curated methylation dataset of sorted cell-types. The majority of this manuscript describes the methylation patterns and features of the reference methylomes (bulk, sorted cell types), with a smaller portion devoted to cell-free DNA tissue of origin deconvolution.

      I believe the data the authors are retrieving from the Loyfer study are from the 23 healthy plasma cfDNA methylomes analyzed in the study, and not the re-analysis of the 52 COVID-19 samples from Cheng et al (MED 2021).

      Paper by Moss et al (additional context):

      Moss et al is another landmark paper that predates the Loyfer et al manuscript. The technology used in this study (methylation arrays) is outdated, but the paper remains an incredible resource for the community. This paper evaluates cfDNA tissues of origin in health and different disease scenarios. Again, I assume the current manuscript only pulled data from healthy patients, although I cannot be sure as it is not described in the methods section.

      This manuscript:

      The current manuscript takes (I think) the total cfDNA concentration from males and females from the Moss et al manuscript (pooled cfDNA; 2 young male groups, 2 old male groups, 2 young female groups, 2 old female groups, Supplementary Dataset; "total_cfDNA_conc" tab). I believe this is the data used as total cfDNA concentration. It would be beneficial for all readers if the authors clarified this point.

      The tissues-of-origin tab in the supplemental dataset ("fraction" tab) presents data from 8 cell types (erythrocytes, monocytes/macrophages, megakaryocytes, granulocytes, hepatocytes, endothelial cells, lymphocytes, other). The fractions in the spreadsheet do not match the Loyfer or Moss manuscripts for healthy individuals. Thus, I do not know what values the supplementary dataset represents. I also don't know which deconvolution values are used for the flux model.

      The integration of these two methods lacks detail. Are the authors here using yields (i.e., cfDNA concentrations) from Moss et al, and tissue fractions from Loyfer et al? If so, why? There are more samples in the Loyfer manuscript, so why are the samples from Moss et al. being used? The authors are also selectively ignoring cell types that are present in healthy individuals (neurons from Moss et al, 2018). Why?

      Appraisal:

      At this stage of the manuscript, I think additional evidence and analysis are required to confirm the results.

      Impact:

      Once the authors present additional analysis to substantiate their results, this manuscript will be highly impactful on the community. The field of liquid biopsies (non-invasive diagnostics) has the potential to revolutionize the medical field (and has already in certain areas, such as prenatal diagnostics). Yet, there is a lack of basic science questions in the field. This manuscript is an important step forward in asking more "basic science" questions that seek to answer a fundamental biological question.

      We thank the reviewer for the valuable comments on our analysis. In response to the feedback, we have updated the analysis to address all critical points as described below and revised the text to enhance the clarity of our methodology. One notable improvement to our analysis involved ensuring better alignment between the cohort data for cfDNA plasma concentration and cell turnover estimates. To achieve this, we utilized the total plasma concentration of cfDNA from a study conducted by Meddeb et al. 2019, taking into account the influence of age and sex on these concentrations and specifically focusing on a cohort of relatively young and healthy individuals. Additionally, we considered expected variations related to sex, age, and other pertinent factors, as outlined in the studies by Meddeb et al. 2019 and Madsen et al. 2019.

      In addition, we have addressed concerns regarding the technical aspects of cfDNA analysis, providing detailed explanations of their limited impact on our analysis and the resulting conclusions.

      Reviewer #2 (Public Review):

      Summary:

      Cell-free DNA (cfDNA) are short DNA fragments released into the circulation when cells die. Plasma cfDNA level is thought to reflect the degree of cell-death or tissue injury. Indeed, plasma cfDNA is a reliable diagnostic biomarker for multiple diseases, providing insights into disease severity and outcomes. In this manuscript, Dr. Sender and colleagues address a fundamental question: What fraction of DNA released from cell death is detectable as plasma cfDNA? The authors use public data to estimate the amount of DNA produced from dying cells. They also utilize public data to estimate plasma cfDNA levels. Their calculations showed that <10% of DNA released is detectable as plasma cfDNA, the fraction of detectable cfDNA varying by tissue sources. The study demonstrates new and fundamental principles that could improve disease diagnosis and treatment via cfDNA.

      Strengths:

      1) The experimental approach is resource-mindful, taking advantage of publicly available data to estimate the fraction of detectable cfDNA in physiological states. The authors did not assess whether the fraction of detectable cfDNA changes in disease conditions. Nonetheless, their pioneering study lays the foundation and provides the methods needed for a similar assessment in disease states.

      2) The findings of this study potentially explain discrepancies in measured versus expected tissue-specific cfDNA from some tissues. For example, the gastrointestinal tract is subject to high cell turnover and release of DNA. Yet, only a small fraction of that DNA ends up in plasma as gastrointestinal cfDNA.

      3) The study proposes potential mechanisms that could account for the low fraction of detectable cfDNA in plasma relative to DNA released. This includes intracellular or tissue machinery that could "chew up" DNA released from dying cells, allowing only a small fraction to escape into plasma as cfDNA. Could this explain why the gastrointestinal tract, with its elaborate phagosome machinery, contributes a small fraction of plasma cfDNA? Given the role of cfDNA as a damage-associated molecular pattern in some diseases, targeting such machinery may provide novel therapeutic opportunities.

      Weaknesses:

      In vitro and in vivo studies are needed to validate these findings and define the tissue machinery that contributes to cfDNA production. The validation studies should address the following limitations of the study design:

      1) Align the cohorts used to estimate DNA production and plasma cfDNA levels. Cellular turnover rate and plasma cfDNA levels vary with age, sex, circadian clock, and other factors (Madsen AT et al, EBioMedicine, 2019). This study estimated DNA production using data abstracted from a homogeneous group of healthy control males (Sender & Milo, Nat Med 2021). On the other hand, plasma cfDNA levels were obtained from datasets of a more diverse cohort of healthy males and females with a wide range of ages (Loyfer et al. Nature, 2023 and Moss et al., Nat Commun, 2018).

      2) "cfDNA fragments are not created equal". Recent studies demonstrate that cfDNA composition varies with disease state. For example, cfDNA GC content, the fraction of short fragments, and the representation of some genomic elements increase in heart transplant rejection compared to the no-rejection state (Agbor-Enoh, Circulation, 2021). The genomic location and disease state may therefore be important factors to consider in these analyses.

      3) Alternative sources of DNA production should be considered. Aside from cell death, DNA can be released from cells via active secretion. This and other additional sources of DNA should be considered in future studies. The distinct characteristics of mitochondrial DNA to genomic DNA should also be considered.

      We appreciate the reviewer's comments on our analysis. In response to the feedback, we have updated the analysis to address the key points and revised the text accordingly.

      1) We have incorporated several enhancements to improve the coherence of our analysis. In our revised analysis, we used the total plasma concentration of cfDNA reported by Meddeb et al. (2019), while considering the influence of age and sex on these concentrations. To align the cohorts, we focused on relatively young and healthy individuals, specifically those below the age of 47. This approach allowed for a more meaningful comparison with the estimated DNA flux from a reference male human aged between 20 and 30 years.

      Neither Meddeb et al. nor Loyfer et al. provides a specific estimate for a cohort of young males; however, we factored in the expected variations stemming from sex, age, and other relevant factors, as described in the literature (Meddeb et al. 2019; Madsen et al. 2019). We thereby demonstrate that sex and age have a small effect on cfDNA concentrations and are unlikely to alter our conclusions substantially when considering a healthy population. We summarize the changes in the first paragraph replacing the “Tissue-specific cfDNA concentration” subsection of the Methods, and in the fourth paragraph added to the Discussion.

      2) In this study, we addressed the total amount of cfDNA in healthy individuals without regard to GC content, representation of different genomic regions, or fragment length, as the goal was to understand if cell death rates are fully accounted for by cfDNA concentration. We agree that it will be interesting to study the relative representation of the genome in cfDNA and the processes that determine cfDNA concentration in pathologies beyond the rate of cell death. These topics for future research fall beyond this study's scope.

      3) We know of only a few specific cases in which DNA is released from cells that are not dying. These include the release of DNA from erythroblasts and megakaryocytes to generate anucleated erythrocytes and platelets (Moss et al. 2022, cited in our paper) and the release of NETs from neutrophils.

      The presence of cfDNA fragments originating from megakaryocytes and erythroblasts indicates the elimination of megakaryocytes and erythroblasts and the birth of erythrocytes and platelets. However, the considerations in the rest of the paper still apply: the concentration of cfDNA from these sources is far lower than expected from the cell turnover rate.

      Concerning NETosis: the presence of cfDNA originating in neutrophils that have not died would reduce the concentration of cfDNA from dying neutrophils and thus further increase the discrepancy, which is the topic of our study (under-representation of DNA from dying cells in plasma).

      We neglected mitochondrial DNA, as it is not measured in methylation cell-of-origin analysis. Similarly to the argument above, if some of the total DNA measured in plasma is in fact mitochondrial, this would mean that the genomic cfDNA concentration is actually lower than the estimates, meaning that an even smaller fraction of DNA from dying cells is measured in plasma.

      Recommendations For The Authors

      Reviewer #1 (Recommendations For The Authors):

      I think readers would appreciate the authors commenting or addressing the following points, in addition to addressing the concerns I raised about the methods section in the public review:

      What variables and considerations did the authors omit in this study?

      1) Cell-free DNA is found in virtually every biofluid.

      Thus, the fact that cell-free DNA is not present in the plasma does not mean it cannot be detected elsewhere. This also implies that phagocytosis may not be the only factor related to cfDNA not being present in the blood. One example (of many, many others) is neutrophil-derived cell-free DNA, which is present in the urine.

      Indeed, dying cells and their DNA can be consumed locally, released into the blood, or shed outside the body. The latter is a function of tissue topology. For example, intestinal epithelial cell turnover releases material to the lumen of the gut (i.e., stool); kidney and bladder cell turnover releases material to urine; and lung epithelium releases material to the air spaces. In these cases, the absence of cfDNA in plasma is expected. However, in cases where tissue topology dictates release to blood, low representation in cfDNA indicates local consumption or a related mechanism. In Figure 1 of the manuscript, we distinguish between tissues according to their topology, denoting organs that shed material to the outside by open circles.

      Neutrophil-derived DNA in urine likely represents a local process in the kidney (neutrophils that penetrate the epithelium and fall into the urine). Neutrophils that die elsewhere in the body must release cfDNA to the blood before it can reach the urine. Hence, quantifying plasma cfDNA is a legitimate approach for assessing the relationship between cell death and cfDNA. The revised text clarifies this point. We made revisions to the initial paragraph in the results section and a paragraph within the discussion to provide clarity on this topic:

      “Based on atlases of human cell type-specific methylation signatures, Moss et al. and Loyfer et al. analyzed the main cell types contributing to plasma cfDNA. They found the primary sources of plasma cfDNA to be blood cells: granulocytes, megakaryocytes, macrophages, and/or monocytes (the signature could not differentiate between the last two), lymphocytes, and erythrocyte progenitors. Other cells that had detectable contributions are endothelial cells and hepatocytes. Qualitatively, these cells represent most of the leading cell types in cellular turnover, as shown in Sender & Milo 2021 (Sender and Milo 2021). Epithelial cells of the gastrointestinal tract, lung, kidney, bladder, and skin are other cell types that significantly contribute to cellular turnover. Dying cells in these tissues are shed into the gut lumen, the air spaces, the urine, or out of the skin (note that while DNA from gut, lung, and kidney epithelial cells can be found in stool, bronchoalveolar lavage, and urine, the fate of DNA from skin cells is not known). This arrangement may explain why DNA from these cell types is not represented in plasma cfDNA in healthy conditions. Therefore, it appears that cells with high cfDNA plasma levels are those with relatively high turnover that are not being shed out of the body.”

      “A comparison between the different types of cells shows a trend in which less DNA flux from cells with higher turnover gets to the bloodstream. In particular, a tiny fraction (1 in 3×10^4) of DNA from erythroid progenitors arrives at the plasma, indicating an extreme efficiency of the DNA recovery mechanism. Erythroid progenitors are arranged in erythroblastic islands. Up to a few tens of erythroid progenitors surround a single macrophage that collects the nuclei extruded during the erythrocyte maturation process (pyrenocytes) (Chasis and Mohandas 2008). The amount of DNA discarded through the maturation of over 200 billion erythrocytes per day (Sender and Milo 2021) exceeds all other sources of homeostatic discarded DNA. Our findings indicate that the organization of dedicated erythroblastic islands functions highly efficiently regarding DNA utilization. Neutrophils are another high-turnover cell type with a low level of cfDNA. When contemplating the process of NETosis (Vorobjeva and Chernyak 2020), the existence of cfDNA originating from live neutrophils would potentially diminish the concentration of cfDNA released by dying neutrophils, thereby amplifying the observed ratio for this particular cell type. The overall trend of higher turnover resulting in a lower cfDNA to DNA flux ratio may indicate similar design principles, in which the utilization of DNA is better in tissues with higher turnover. However, our analysis is limited to only several cell types (due to cfDNA test and deconvolution sensitivities), and extrapolation to cells with lower cell turnover is problematic.”
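
      To give a sense of scale for the erythroid figures quoted above, the arithmetic can be sketched as follows. This is an illustrative back-of-the-envelope calculation, not the authors' model: the erythrocyte production rate and the 1-in-3×10^4 fraction come from the quoted text, while the diploid genome mass of about 6.5 pg and the premise that each extruded nucleus carries one diploid genome are assumptions introduced here. The function and constant names are hypothetical.

```python
# Back-of-the-envelope sketch of the erythroid DNA flux discussed above.
# From the quoted text: over 200 billion erythrocytes mature per day, and
# roughly 1 in 3x10^4 of the discarded DNA arrives at the plasma.
# ASSUMPTION (not from the text): each extruded nucleus (pyrenocyte) carries
# one diploid human genome weighing approximately 6.5 pg.

ERYTHROCYTES_PER_DAY = 2e11      # lower bound quoted in the text
FRACTION_REACHING_PLASMA = 1 / 3e4
DIPLOID_GENOME_PG = 6.5          # assumed, approximate


def discarded_dna_g_per_day(cells_per_day, genome_pg=DIPLOID_GENOME_PG):
    """Mass of DNA discarded per day, in grams (1 pg = 1e-12 g)."""
    return cells_per_day * genome_pg * 1e-12


def plasma_dna_ug_per_day(cells_per_day, fraction, genome_pg=DIPLOID_GENOME_PG):
    """Mass of that DNA expected to reach plasma per day, in micrograms."""
    return discarded_dna_g_per_day(cells_per_day, genome_pg) * fraction * 1e6


discarded = discarded_dna_g_per_day(ERYTHROCYTES_PER_DAY)
in_plasma = plasma_dna_ug_per_day(ERYTHROCYTES_PER_DAY, FRACTION_REACHING_PLASMA)

print(f"DNA discarded by erythroid maturation: ~{discarded:.1f} g/day")
print(f"Portion reaching plasma: ~{in_plasma:.0f} micrograms/day")
```

      Under these assumptions, on the order of a gram of DNA is discarded daily by erythroid maturation, of which only tens of micrograms would reach the plasma.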

      2) Effect of biofluid storage.

      Cell-free DNA continues to degrade after it is extracted via blood draw. This is not expected to change tissue of origin predictions (although that remains to be shown in the literature), but definitely affects extraction yield. This is not accounted for (or even discussed) in the manuscript. It would be important to understand how this was done for the data presented here.

      The paper integrates data from multiple recent studies that adhered to state-of-the-art procedures requiring rapid processing of blood samples. In fact, earlier studies that were not careful to isolate plasma quickly typically reported very high concentrations due to the lysis of leukocytes and artifactual release of genomic DNA. Rapid plasma isolation and DNA extraction typically yield 5ng/ml in healthy donors, as stated in the paper (last paragraph of Results).

      3) Batch effects

      Batch effects are not discussed here and can affect cfDNA yields.

      Our analysis relies on data reported by multiple studies from different groups, which independently arrived at similar key findings (total concentration of cfDNA and the relative contribution of different tissues). Thus, batch effects are unlikely to affect the calculations markedly.

      4) Cell-free DNA extraction kits

      Different kits and methods extract cell-free DNA at different quantities. Importantly, much recent research has shown that most kits are not sensitive to ultrashort cell-free DNA (of lengths ~50bp). This may represent most of the DNA present in plasma. This raises an important question: are the yields that are being used in Moss et al (where I presume the total concentration is taken from) accurate? Is there more cell-free DNA that was missed? While the importance of this ultrashort cfDNA has yet to be shown, it is in the blood. Thus, the authors' model may underestimate ratios by not accounting for this. This is mentioned in the discussion, but it is not evident why it was not added into the model.

      The Qiagen cfDNA extraction kit can detect 50bp fragments. As shown in the specification sheets of the kit (https://www.qiagen.com/us/products/diagnostics-and-clinical-research/solutions-for-laboratory-developed-tests/qiasymphony-dsp-circulating-dna-kit), urine DNA contains abundant DNA fragments that peak at 50bp. In contrast, plasma cfDNA does not contain such fragments at appreciable concentrations. This suggests that small fragments, 50-150bp long, are not a major component of cfDNA, and thus, our measurements of the total concentration of cfDNA are not dramatically underestimated.

      The convention regarding the size distribution of cfDNA fragments is based on extensive evidence using multiple approaches. For example, a study that profiled the DNA released by multiple cell lines in vitro (Aucamp et al. 2017) used another kit for DNA isolation – the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Düren, Germany). This kit does extract fragments that are 50bp long (nucleospin-gel-and-pcr-clean-up-mini). Indeed, the DNA released from cultured cells did contain a peak at 50bp, but it was minor compared with the nucleosome-size peak.

      More recently, several studies did suggest the presence of ultra-short cfDNA fragments, 50 bp long on average, and concluded that such fragments might be present at a molar concentration that is comparable to that of nucleosome-protected DNA (for example, (Hisano et al. 2021)).

      Thus, our model estimates can be off by up to 2-fold (that is, the cfDNA concentration measured in most studies overlooks the small fragments and may therefore underestimate the actual concentration by up to 2-fold). This is incorporated into the revised manuscript.

      We note that we cannot exclude the presence of abundant ultra-short DNA fragments (e.g., 10bp long). However, such fragments are not measurable in cfDNA analysis. Thus, we can refine our conclusion and state that only a small fraction of DNA of dying cells appears as measured cfDNA. We included a section in the methods detailing the integration of a potential factor for the short fragments and revised the discussion:

      “The overall plasma cfDNA concentration was multiplied by a factor of 1.5 to accommodate for the presence of small fragments of approximately 50 base pairs of cfDNA in the plasma. These fragments are suggested to contribute comparable molar concentrations (Hisano, Ito, and Miura 2021). Despite having approximately one-third of the mass, it is reasonable to presume that these fragments represent a similar number of genomes. This assumption is based on the idea that their source is a broken nucleosome unit, and the fragments represent the portion that was not degraded. Given the restricted data and its interpretation, we consider factors spanning the range of 1 (negligible effect) and 2 (doubling of the amount). The chosen factor, 1.5, is selected as the midpoint within this range of uncertainty.”

      “In this study, we report a surprising, dramatic discrepancy between the measured levels of cfDNA in the plasma and the potential DNA flux from dying cells. One hypothetical explanation for that discrepancy is the limited sensitivity of typical cfDNA assays to short DNA fragments, which may contribute a significant fraction of the overall cfDNA mass. Regular cfDNA analysis shows a size distribution concentrated around a length of 165 base pairs (bp). The sizes in ctDNA vary more, but most are longer than 100 bp (Alcaide et al. 2020; Udomruk et al. 2021). Recent studies suggested a significant fraction of single-strand ultrashort fragments (length of 25-60 bp) (Cheng et al. 2022; Hisano, Ito, and Miura 2021). However, the total amount of DNA contained in these fragments is less than or comparable to that of the longer “regular” nucleosome-protected cfDNA fragments (Cheng et al. 2022; Hisano, Ito, and Miura 2021), arguing against ultrashort fragments as a dominant explanation for the “missing” cfDNA material. We integrated the estimate provided by Hisano et al. into our analysis as a modifying factor for both the total concentration and uncertainty of plasma cfDNA. Importantly, this incorporation did not alter the overall conclusions, as the discrepancy between the cfDNA plasma concentration and potential DNA flux remains on the same order of magnitude. We note that we cannot exclude the presence of abundant DNA fragments that are even shorter (e.g., 10bp long) and are not measurable in cfDNA analysis. Thus, our formal conclusion is that only a small fraction of the DNA of dying cells appears as measurable cfDNA.”
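
      The short-fragment correction described in the quoted Methods paragraph can be illustrated with a small calculation. This is a minimal sketch under stated assumptions rather than the authors' code: the factor range of 1 to 2, its midpoint of 1.5, and the fragment lengths of about 50 bp and 165 bp come from the quoted text, and the 5 ng/ml input is the typical healthy-donor yield mentioned earlier in the response. The function names are hypothetical.

```python
# Sketch of the short-fragment correction factor described above.
# From the quoted text: the factor ranges from 1 (negligible effect) to
# 2 (doubling), and the midpoint, 1.5, is used as the point estimate.
# Fragment lengths (~50 bp ultrashort, ~165 bp nucleosome-protected) are
# also taken from the text; the function names are illustrative only.

SHORT_FRAGMENT_BP = 50
REGULAR_FRAGMENT_BP = 165


def short_fragment_factor(low=1.0, high=2.0):
    """Midpoint of the plausible range for the correction factor."""
    return (low + high) / 2


def corrected_concentration(measured_ng_per_ml, low=1.0, high=2.0):
    """Return (lower bound, point estimate, upper bound) in ng/ml."""
    mid = short_fragment_factor(low, high)
    return (measured_ng_per_ml * low,
            measured_ng_per_ml * mid,
            measured_ng_per_ml * high)


# At equal molar concentration, a 50 bp fragment carries about one-third of
# the mass of a 165 bp fragment, matching "approximately one-third" above.
mass_ratio = SHORT_FRAGMENT_BP / REGULAR_FRAGMENT_BP

lower, estimate, upper = corrected_concentration(5.0)
print(f"Mass ratio short/regular: {mass_ratio:.2f}")
print(f"Corrected cfDNA: {estimate} ng/ml (range {lower} to {upper})")
```

      Applied to a measured 5 ng/ml, the correction gives a point estimate of 7.5 ng/ml within a range of 5 to 10 ng/ml, which shifts the plasma concentration by less than one order of magnitude, consistent with the statement that the overall conclusions are unchanged.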

      5) Health status of samples analyzed.

      Health, sex, and physical activity affect cfDNA yields. This is not accounted for or discussed in the manuscript.

      We incorporated several enhancements to our analysis in response to this feedback. In our revised examination, we drew upon the total plasma concentration of cfDNA documented by Meddeb et al. (2019), while considering the influence of age and sex on these concentrations. To align the cohorts, we focused on relatively young and healthy individuals, specifically those below the age of 47. This approach allowed for a more meaningful comparison with the estimated DNA flux from a reference male human aged between 20 and 30 years.

      Furthermore, we factored in the expected variations stemming from sex, age, and other relevant factors, as described by Meddeb et al. (2019) and Madsen et al. (2019). Our intent in doing so was to demonstrate that these factors are unlikely to alter our conclusions substantially when considering a healthy population. We summarize the changes in the first paragraph, which replaces the “Tissue-specific cfDNA concentration” subsection of the Methods, and in the fourth paragraph added to the Discussion:

      “Our estimates for total plasma cfDNA concentration were derived from the median concentration observed in individuals below 47 years of age (n=52), as reported by (Meddeb et al. 2019). To complement this, we integrated our total concentration estimates with data on the proportion of cfDNA originating from specific cell types, leveraging a plasma methylome deconvolution method described by (Loyfer et al. 2023), which did not provide absolute quantities of cfDNA. To quantify the uncertainty associated with our cfDNA concentration estimates, we employed a methodology that considered several sources of variation. First, we incorporated the confidence interval of the median concentration reported by Meddeb et al. as a measure of uncertainty. Additionally, we accounted for individual-specific and analytic variations based on the study by (Madsen et al. 2019), encompassing factors such as the precise timing of measurements and assay precision. These sources of uncertainty were combined using the approach outlined below.”

      “Our current analysis focused on estimating plasma cfDNA concentration and cellular turnover in a cohort of healthy, relatively young individuals. The total plasma cfDNA concentrations were sourced from healthy individuals below 47 years, as reported by (Meddeb et al. 2019). We use data analyzed based on plasma samples from healthy individuals to estimate the proportion of cfDNA originating from specific cell types (Loyfer et al. 2023). These values were then compared to the potential DNA flux resulting from homeostatic cellular turnover, estimated for reference healthy males aged between 20 and 30 (Sender and Milo 2021). In our analysis, we considered various sources of uncertainty, including inter-individual variation, variability in the timing of sample collection, and analytical precision (Madsen et al. 2019; Meddeb et al. 2019). These factors collectively contributed to an uncertainty factor of less than 3. Importantly, this level of uncertainty does not alter our conclusion regarding the relatively small fraction of DNA present in plasma as cfDNA. Furthermore, we acknowledge that age and sex can impact total cfDNA concentration, as demonstrated by (Meddeb et al. 2019), with potential variations of up to 30%. However, as the results of our analysis present a much larger difference, these effects do not change the conclusions drawn from our analysis. Nevertheless, age and health status may influence the proportion of cfDNA originating from specific cell types and their corresponding cellular turnover rates. Consequently, the ratios themselves may vary in the elderly population or individuals with underlying health conditions.”
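The quoted passages state that several sources of uncertainty were combined into an overall factor of less than 3, but the combination rule itself is not reproduced here. The sketch below shows one common convention, offered as an assumption rather than as the authors' method: independent multiplicative (fold) uncertainties are combined by adding their logarithms in quadrature. The function name and all fold factors are hypothetical.

```python
import math


def combine_fold_uncertainties(factors):
    """Combine independent multiplicative (fold) uncertainties.

    Assumption: each factor f >= 1 describes a log-symmetric error, so the
    natural logs are added in quadrature. This is a common convention and
    is not necessarily the approach used in the manuscript.
    """
    factors = list(factors)
    if any(f < 1 for f in factors):
        raise ValueError("fold factors must be >= 1")
    log_sq = sum(math.log(f) ** 2 for f in factors)
    return math.exp(math.sqrt(log_sq))


# Hypothetical fold factors, for illustration only.
sources = {
    "median_confidence_interval": 1.3,
    "individual_and_analytic_variation": 1.8,
    "short_fragment_correction": 1.33,
}
combined = combine_fold_uncertainties(sources.values())
print(f"combined fold uncertainty: {combined:.2f}")
```

Under this convention the combined factor is always at least as large as the largest single factor but smaller than the product of all of them, which is why a handful of moderate uncertainties can still yield an overall factor below 3.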

      Reviewer #2 (Recommendations For The Authors):

      1) Align the cohorts to estimate DNA production and plasma cfDNA levels. Cellular turnover rate and plasma cfDNA levels vary with age, sex, circadian clock, and other factors (Madsen AT et al, EBioMedicine, 2019). This study estimated DNA production using data abstracted from a homogeneous group of healthy control males (Sender & Milo, Nat Med 2021). On the other hand, plasma cfDNA levels were obtained from datasets from a more diverse cohort of healthy males and females with a wide range of ages (Loyfer et al. Nature, 2023 and Moss et al., Nat Commun, 2018).

      We have incorporated several enhancements to improve the coherence of our analysis. In our revised examination, we drew upon the total plasma concentration of cfDNA documented by Meddeb et al. (2019), while considering the influence of age and sex on these concentrations. To align the cohorts, we focused on relatively young and healthy individuals, specifically those below the age of 47. This approach allowed for a more meaningful comparison with the estimated DNA flux from a reference male human aged between 20 and 30 years.

      Neither Meddeb et al. nor Loyfer et al. provide a specific estimate for a cohort of young males; however, we factored in the expected variations stemming from sex, age, and other relevant factors, as described in the literature (Meddeb et al. 2019; Madsen et al. 2019). Thus, we demonstrate that sex and age have a small effect on the cfDNA concentrations and are unlikely to alter our conclusions substantially when considering a healthy population.

      We summarize the changes in the first paragraph, replacing the “Tissue-specific cfDNA concentration” subsection of the method, and the fourth paragraph added to the discussion.

      “Our estimates for total plasma cfDNA concentration were derived from the median concentration observed in individuals below 47 years of age (n=52), as reported by (Meddeb et al. 2019). To complement this, we integrated our total concentration estimates with data on the proportion of cfDNA originating from specific cell types, leveraging a plasma methylome deconvolution method described by (Loyfer et al. 2023), which did not provide absolute quantities of cfDNA. To quantify the uncertainty associated with our cfDNA concentration estimates, we employed a methodology that considered several sources of variation. First, we incorporated the confidence interval of the median concentration reported by Meddeb et al. as a measure of uncertainty. Additionally, we accounted for individual-specific and analytic variations based on the study by (Madsen et al. 2019), encompassing factors such as the precise timing of measurements and assay precision. These sources of uncertainty were combined using the approach outlined below.”

      “Our current analysis focused on estimating plasma cfDNA concentration and cellular turnover in a cohort of healthy, relatively young individuals. The total plasma cfDNA concentrations were sourced from healthy individuals below 47 years, as reported by (Meddeb et al. 2019). We use data analyzed based on plasma samples from healthy individuals to estimate the proportion of cfDNA originating from specific cell types (Loyfer et al. 2023). These values were then compared to the potential DNA flux resulting from homeostatic cellular turnover, estimated for reference healthy males aged between 20 and 30 (Sender and Milo 2021). In our analysis, we considered various sources of uncertainty, including inter-individual variation, variability in the timing of sample collection, and analytical precision (Madsen et al. 2019; Meddeb et al. 2019). These factors collectively contributed to an uncertainty factor of less than 3. Importantly, this level of uncertainty does not alter our conclusion regarding the relatively small fraction of DNA present in plasma as cfDNA. Furthermore, we acknowledge that age and sex can impact total cfDNA concentration, as demonstrated by (Meddeb et al. 2019), with potential variations of up to 30%. However, as the results of our analysis present a much larger difference, these effects do not change the conclusions drawn from our analysis. Nevertheless, age and health status may influence the proportion of cfDNA originating from specific cell types and their corresponding cellular turnover rates. Consequently, the ratios themselves may vary in the elderly population or individuals with underlying health conditions.”

      2) "cfDNA fragments are not created equal". Recent studies demonstrate that cfDNA composition varies with disease state. For example, cfDNA GC content, the fraction of short fragments, and the composition of some genomic elements increase in heart transplant rejection compared to the no-rejection state (Agbor-Enoh, Circulation, 2021). The genomic location and disease state may therefore be important factors to consider in these analyses.

      In this study, we addressed the total amount of cfDNA in healthy individuals without regard to GC content, representation of different genomic regions, or fragment length, as the goal was to understand if cell death rates are fully accounted for by cfDNA concentration. We agree that it will be interesting to study the relative representation of the genome in cfDNA and the processes that determine cfDNA concentration in pathologies beyond the rate of cell death. These topics for future research fall beyond this study's scope.

      3) Alternative sources of DNA production should be considered. Aside from cell death, DNA can be released from cells via active secretion. This and other additional sources of DNA should be considered in future studies. The distinct characteristics of mitochondrial DNA relative to genomic DNA should also be considered.

      We know only a few specific cases whereby DNA is released from cells that are not dying. These include the release of DNA from erythroblasts and megakaryocytes to generate anucleated erythrocytes and platelets (Moss et al. 2022, cited in our paper) and the release of NETs from neutrophils.

      The presence of cfDNA fragments originating from megakaryocytes and erythroblasts indicates the elimination of megakaryocytes and erythroblasts and the birth of erythrocytes and platelets. However, the considerations in the rest of the paper still apply: the concentration of cfDNA from these sources is far lower than expected from the cell turnover rate.

      Concerning NETosis: the presence of cfDNA originating in neutrophils that have not died would reduce the concentration of cfDNA from dying neutrophils and thus further increase the discrepancy, which is the topic of our study (under-representation of DNA from dying cells in plasma).

      We updated a paragraph in the discussion regarding this issue:

      “A comparison between the different types of cells shows a trend in which less DNA flux from cells with higher turnover gets to the bloodstream. In particular, a tiny fraction (1 in 3×10^4) of DNA from erythroid progenitors arrives at the plasma, indicating an extreme efficiency of the DNA recovery mechanism. Erythroid progenitors are arranged in erythroblastic islands. Up to a few tens of erythroid progenitors surround a single macrophage that collects the nuclei extruded during the erythrocyte maturation process (pyrenocytes) (Chasis and Mohandas 2008). The amount of DNA discarded through the maturation of over 200 billion erythrocytes per day (Sender and Milo 2021) exceeds all other sources of homeostatic discarded DNA. Our findings indicate that the organization of dedicated erythroblastic islands functions highly efficiently regarding DNA utilization. Neutrophils are another high-turnover cell type with a low level of cfDNA. When contemplating the process of NETosis (Vorobjeva and Chernyak 2020), the existence of cfDNA originating from live neutrophils would potentially diminish the concentration of cfDNA released by dying neutrophils, thereby amplifying the observed ratio for this particular cell type. The overall trend of higher turnover resulting in a lower cfDNA to DNA flux ratio may indicate similar design principles, in which the utilization of DNA is better in tissues with higher turnover. However, our analysis is limited to only several cell types (due to cfDNA test and deconvolution sensitivities), and extrapolation to cells with lower cell turnover is problematic.”

      We neglected mitochondrial DNA, as it is not measured in methylation cell-of-origin analysis. Similarly to the argument above, if some of the total DNA measured in plasma is in fact mitochondrial, this would mean that genomic cfDNA concentration is actually lower than the estimates, meaning that an even smaller fraction of DNA from dying cells is measured in plasma.

    2. Joint Public Review:

      Summary

      Sender et al. describe a model to estimate what fraction of DNA becomes cell-free DNA in plasma. This is of great interest to the community, as the amount of DNA from a certain tissue (for example, a tumor) that becomes available for detection in the blood has important implications for disease detection.

      Strengths

      The question asked by the authors has potentially important implications for disease diagnosis. Understanding how genomic DNA degrades in the human circulation can guide towards ways to enrich for DNA of interest or may lead to unexpected methods of conserving cell-free DNA. Thus, the question "how much genomic DNA becomes cfDNA" is of great interest to the scientific and medical community. I believe this manuscript has the potential to be a widely used resource. As more data is collected on cell-free DNA yields and cellular turnover in the body, this work will only increase in importance.

      Appraisal

      At this stage of the manuscript (second submission), I think the authors provide important evidence and analysis that aim to answer their research question. Previous concerns about methodology have been addressed.

      Impact

      This manuscript will be highly impactful on the community. The field of liquid biopsies (non-invasive diagnostics) has the potential to revolutionize the medical field (and has already in certain areas, such as prenatal diagnostics). Yet, there is a lack of basic science questions in the field. This manuscript is an important step forward in asking more "basic science" questions that seek to answer a fundamental biological question.

    1. eLife assessment

      The current manuscript presents a cryo-EM structure of a tripartite ATP-independent periplasmic (TRAP) transporter that contributes to Haemophilus influenzae virulence. Convincing biophysical and cryo-EM experiments yield a valuable molecular model, but the functional importance of some of the molecular features identified remains to be demonstrated.

    1. eLife assessment

      This valuable paper compares blood gene signature responses between small cohorts of individuals with mild and severe COVID-19 and claims that an early innate immune response mediated via NK cells leads to less severe infection, more rapid viral clearance, and Th1/2 differentiation. The evidence supporting the conclusions is solid based on the use of appropriate and comprehensive assays and analysis tools, but not definitive based on mismatched timing of samples between the two cohorts coupled with small cohort size.

    2. Reviewer #1 (Public Review):

      Summary:

      Medina et al. (2023) investigated the peripheral blood transcriptional responses in patients with differing disease outcomes. The authors characterized the blood transcriptome of four non-hospitalized individuals presenting mild disease and four patients hospitalized with severe disease. These individuals were observed longitudinally at three time points (0, 7, and 28 days post-recruitment), and distinct transcriptional responses were observed between severe hospitalized patients and mild non-hospitalized individuals, especially at the 0- and 7-day collection time points. In particular, the authors found that increased expression of genes associated with NK cell cytotoxicity is associated with mild outcomes. Additional co-regulated gene network analyses positively correlate T cell activity with mild disease and neutrophil degranulation with severe disease.

      Strengths:

      The longitudinal measurements in individual participants at consistent collection intervals can offer an added dimension to the dataset that involves temporal trajectories of genes associated with disease outcomes and is a key strength of the study. The use of co-expressed gene networks specific to the cohort to complement enrichment results obtained from pre-determined genesets can offer valuable insights into new associations/networks associated with disease progression and warrants further analyses on the biological functions enriched within these co-expressed network modules.

      Weaknesses:

      There is a large difference in terms of infection timeline (onset of symptom to recruitment) between mild and severe patient cohorts. As immune responses during early infection can be highly dynamic, the differences in infection timeline may contribute to differences in transcriptional signatures. The study is also limited by a small cohort size.

    3. Reviewer #2 (Public Review):

      In their manuscript, Medina and colleagues investigate transcriptional differences between mild and severe SARS-CoV-2 infections. Their analyses are very comprehensive incorporating a multitude of bioinformatics tools ranging from PCA plots, GSEA and DEG analysis, protein-protein interaction network, and weighted correlation network analyses. They conclude that in mild COVID-19 infection NK cell functionality is compromised and this is connected to cytokine interactions and Th1/Th2 cell differentiation pathways cross-talk, bridging the innate and the adaptive arms of the immune system.

      The authors successfully recruited participants with both mild and severe COVID-19 between November 2020 and May 2021. The analyzed cohort is gender-matched and acceptably age-matched, and the results reported are promising. Signatures associated with NK cell cytotoxicity in the mild group and neutrophil functions in the severe group during acute infection are the chief findings reported in this manuscript.

    4. Reviewer #3 (Public Review):

      Summary:

      Medina and colleagues explored transcriptional kinetics during SARS-CoV-2 infection in non-hospitalized and hospitalized cohorts and identified that early NK signaling may be responsible for less severe disease.

      Strengths:

      The paper includes extremely detailed analyses and makes an interesting attempt to link innate and adaptive responses. The analyses are appropriate for the data and described in clear language. The inclusion of late time points is interesting and potentially relevant to long COVID studies. Most findings were compatible with other detailed immune mapping during severe COVID-19.

      Weaknesses:

      1. The authors claim to be looking at the earliest stages of infection, but this is not true, as all patients enrolled are already symptomatic. The time points selected are unlikely to be useful clinically for biomarker selection as they are too late, and are likely beyond the point when the immune responses between severe and mild infection start to diverge.
      2. The comparator time points between mild and severe cases do not match. The most appropriate comparison would be between day 7 of mild versus day 0 of severe, which is already fairly late during infection.
      3. The authors mention viral clearance, but I see no evidence of viral loads measured in these individuals.
      4. The cohort is quite small to draw definitive conclusions.
      5. It is uncertain whether the results are applicable to current conditions, as most infected people are immune experienced.
      6. I found the discussion to be a bit too detailed and dense. I would suggest editing to make it more streamlined.

    1. eLife assessment

      This valuable study demonstrates that there is significant variation in the susceptibility of isoniazid-resistant Mycobacterium tuberculosis clinical isolates to killing by rifampicin, in some cases at the same tolerance levels as bona fide resistant strains. The evidence provided is solid, with no clear genetic marker for increased tolerance, suggesting that there may be multiple routes to achieving this phenotype. The work will be of interest to infectious disease researchers.

    2. Reviewer #1 (Public Review):

      Summary:

      The study entitled "Rifampicin tolerance and growth fitness among isoniazid-resistant clinical Mycobacterium tuberculosis isolates: an in-vitro longitudinal study" by Vijay et al. provides valuable insights into the association of rifampicin tolerance and growth fitness with isoniazid resistance among clinical isolates of M. tuberculosis. Antibiotic tolerance in M. tuberculosis is an important topic since it contributes to the lengthy and complicated treatment required to cure tuberculosis disease and may portend the emergence of antibiotic resistance. The authors found that rifampicin tolerance was correlated with bacterial growth, rifampicin minimum inhibitory concentrations, and isoniazid-resistance mutations.

      Strengths:

      The large number of clinical isolates evaluated and their longitudinal nature during treatment for TB (including exposure to rifampin) are strengths of the study.

      Weaknesses:

      Some of the methodologies are not well explained or justified and the association of antibiotic tolerance with growth rate is not a novel finding. In addition, the molecular mechanisms underlying rifampicin tolerance only in rapidly growing isoniazid-resistant isolates have not been elucidated and the potential implications of these findings for clinical management are not immediately apparent.

    3. Reviewer #2 (Public Review):

      Summary:

      This study by Vijay and colleagues addresses a clinically important and often overlooked aspect of TB treatment. Variation in the level of antibiotic tolerance among otherwise antibiotic-susceptible isolates is difficult to screen for routinely, and consequently such screening is not performed. The authors present a convincing argument that there is indeed significant variation in the susceptibility of isoniazid-resistant strains to killing by rifampicin, in some cases at the same tolerance levels as bona fide resistant strains. On the whole, the study is easy to follow and the results are justified. This work should be of interest to the wider TB community at both a clinical and basic level.

      Weaknesses:

      The manuscript is long, repetitive in places, and the figures could use some amending to improve clarity (this could be a me-specific issue as they look ok on my screen, yet the colour is poor when printed).

      It would have been great to have seen some correlation between increased rifampicin tolerance and treatment outcome, although I'm not sure if this data is available to the researchers. I agree with the researchers that the use of a single media condition is a limitation. However, this is true of a lot of studies.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors have initiated studies to understand the molecular mechanisms underlying the development of multi-drug resistance in clinical Mtb strains. They demonstrate the association of isoniazid-resistant isolates by rifampicin treatment, supporting the idea that selection of MDR is a microenvironment phenomenon and involves a group of isolates.

      Strengths:

      The methods used in this study are robust and the results support the authors' claims to a major extent.

      Weaknesses:

      The manuscript needs a thorough vetting of the language. At present, the language makes it very difficult to comprehend the methodology and results.

    1. eLife assessment

      This valuable study demonstrates that Chitinase 3-like protein 1 (Chi3l1) interacts with gut microbiota and protects animals from intestinal injury in a laboratory colitis model. The evidence supporting the claims of the authors is considered incomplete. The inclusion of consistent in vivo and in vitro data would have strengthened the study. The work will be of interest to scientists studying crosstalk between gut microbiota and inflammatory diseases.

    2. Reviewer #1 (Public Review):

      The manuscript by Chen et al. investigated the interaction between CHI3L1, a chitinase-like protein in glycosyl hydrolase family 18, and gut bacteria in the mucosal layers. The authors provided evidence to document the direct interaction between CHI3L1 and peptidoglycan, a major component of bacterial cell walls. In doing so, Chi3l1 produced by gut epithelial cells regulates the balance of the gut microbiome and diminishes DSS-induced colitis, potentially through the colonization of protective gram-positive bacteria such as Lactobacillus.

      The study is the first to systematically document the interactions between Chi3L1 and the microbiome. Convincing data were shown to characterize the imbalance of gram-positive bacteria in the newly generated gut epithelial-specific Chi3L1 deficient mice. Comprehensive FMT experiments were performed to demonstrate the contributions of the gut microbiome using the mouse colitis model. However, the manuscript could have been strengthened by additional mechanistic studies concerning the binding between Chi3l1 and peptidoglycan, and how this interaction could facilitate the colonization of gram-positive bacteria. Additionally, the authors' conclusion that disordered intestinal bacteria in gut epithelial-specific Chi3L1 deficient mice, rather than an effect of host cells, contribute to exacerbated colitis needs further validation. Indeed, the fact that FMT did not completely rescue the phenotype may point to a role of host cells in the process. Moreover, there is an existing body of literature demonstrating detrimental roles of Chi3l1 in the mouse IBD model, conflicting with the current study. The differences in study design and approaches that lead to these conflicting findings will need to be discussed.

      Specifically,

      1) In Figure 1, it is curious that the authors chose only E. coli and Staphylococcus sciuri to test the induction of Chi3l1. What about other bacteria? Why does only E. coli, but not Staphylococcus sciuri, induce Chi3l1 production? This does not prove that the gut microbiome induces the expression of Chi3l1. If it is the effect of LPS, does it trigger a cell death response or inflammatory responses that are known to induce Chi3l1 production? What is the role of peptidoglycan in this experiment? Also, it is recommended to change WT to SPF in the figure and text, as no genetic manipulation was involved in this figure.

      2) In Figure 2, the binding between Chi3l1 and PGN needs better characterization, regarding the affinity and how it compares with the binding between Chi3l1 and chitin. More importantly, it is unclear how this interaction could facilitate the colonization of gram-positive bacteria.

      3) In Figure 3, the abundance of Firmicutes and other gram-positive species is lower in the knockout mice. What is the rationale for choosing Lactobacillus in the following transfer experiments?

      4) FDAA-labeled E. faecalis colonization is decreased in the knockouts. Is this specific to E. faecalis, or is it generally true for all gram-positive bacteria? What about the colonization of gram-negative bacteria?

      5) In Figure 5, the fact that FMT did not completely rescue the phenotype may point to a role of host cells in the process. The reason that Lactobacillus transfer did completely rescue the phenotypes could be the overwhelming protective role of Lactobacillus itself, as the experiments were missing villin-cre mice transferred with Lactobacillus.

      6) Conflicting literature demonstrating detrimental roles of Chi3l1 in the mouse IBD model needs to be acknowledged and discussed.

    3. Reviewer #2 (Public Review):

      Chen et al. investigated the regulatory mechanism of bacterial colonization in the intestinal mucus layer in mice and its implications for intestinal diseases. They demonstrated that Chi3l1 is a protein produced and secreted by intestinal epithelial cells into the mucus layer in response to the gut microbiota, which in turn facilitates the colonization of gram-positive bacteria in the mucosa. The data also indicate that Chi3l1 interacts with the peptidoglycan of the bacterial cell wall, supporting the colonization of beneficial bacterial strains such as Lactobacillus, and that deficiency in Chi3l1 predisposes mice to colitis. The inclusion of a small but pertinent piece of human data helps to solidify their findings in mice.

      Overall, the experiments performed were appropriate and well executed, but the data analysis is incomplete and needs to be extended. Also, additional experiments are necessary for clarification and stronger support for their conclusions.

      1) Images are of great quality but lack proper quantification and statistical analysis. Statements such as "substantial increase of Chi3l1 expression in SPF mice" (Fig.1A), "reduced levels of Firmicutes in the colon lumen of IECChil1" (Fig.3F), "Chil1-/- had much lower colonization of E.faecalis" (Fig.4G), or "deletion of Chi3l1 significantly reduced mucus layer thickness" (Supplemental Figure 3A-B) are subjective. Since many conclusions were based on imaging data, the authors must provide reliable measures for comparison between conditions wherever possible, such as fluorescence intensity, area, or density, as well as plots and statistical analysis.

      2) In the fecal/Lactobacillus transplantation experiments, oral gavage of Lactobacillus to IECChil1 mice ameliorated the colitis phenotype, by preventing colon length reduction, weight loss, and colon inflammation. These findings seem to go against the notion that Chi3l1 is necessary for the colonization of Lactobacillus in the intestinal mucosa. The authors could speculate on how Lactobacillus administration is still beneficial in the absence of Chi3l1. Perhaps, additional data showing the localization of the orally administered bacteria in the gut of Chi3l1 deficient mice would clarify whether Lactobacillus are more successfully colonizing other regions of the gut, but not the mucus layer. Alternatively, later time points of 2% DSS challenge, after Lactobacillus transplantation, would suggest whether the gut colonization by Lactobacillus and therefore the milder colitis phenotype, is sustained for longer periods in the absence of Chi3l1.

    4. Reviewer #3 (Public Review):

      Summary:

      Chen et al. are addressing a fundamental question in mammalian gut biology, namely how the host controls a mutualistic host-microbiota symbiosis. The authors focus on a protein called Chitinase 3-like protein 1 (Ch3l1) and its interaction with the protective colonic mucus layer. The rationale for the study comes from previous work showing that microbial-associated molecular patterns (MAMPs) can induce Ch3l1 in vitro, but its biological functions in the colon are unknown. In this study, the authors provide evidence supporting the claim that the gut microbiota induces the expression of Ch3l1 in vivo, mainly in mucus-producing goblet cells. Insightfully, the authors note that Ch3l1, although it lacks enzyme (chitinase) activity, still binds chitin, a glycan that has structural similarity to bacterial cell wall peptidoglycan. This leads the authors to hypothesize that Ch3l1 binds microbial cell walls, particularly those of peptidoglycan-rich Gram-positive probiotic bacteria within the mucus, to promote their retention in the colon. Using a combination of in vivo work with mice conditionally lacking Ch3l1 in gut epithelium (IEC Ch3l1 KO); microbiota profiling; imaging of host-microbiota interactions with labeled microbes; and fecal transplants, the authors provide compelling evidence that Ch3l1 is secreted into the gut mucus layer and that the presence of Ch3l1 is associated with increased levels of beneficial Gram+ bacteria, including Lactobacillus spp. In turn, using a well-characterized colitis model, the authors show that Ch3l1 is associated with protection from intestinal injury caused by Dextran Sodium Sulfate. While these studies are novel and informative, there are several issues that undermine the authors' conclusions.

      Strengths:<br /> The authors nicely link microbial induction of Ch3l1 to mucosal protection from intestinal injury. This is done through the use of germ-free and ex-germ-free studies and by comparing Ch3l1 expression in situ between them; microbial sequencing between Control and IEC Ch3l1 KO mice, and clinical and histological injury metrics between these strains. The authors convincingly demonstrate the presence of Ch3l1 in the gut mucus through imaging, and that the deletion of this protein in mice alters the microbiota by reducing the relative abundance of Gram-positive species.

      The study employs a technically diverse set of analyses to address their hypothesis, including fluorescent labelling of microbial species for add-back studies, fecal transplants to distinguish the role(s) of the microbiota vs. host in the IEC Ch3l1 KO phenotypes in the intestinal challenge models.

      Weaknesses:<br /> The claim that mucus-associated Ch3l1 controls colonization of beneficial Gram-positive species within the mucus is not conclusive. The study should take into account recent discoveries on the nature of mucus in the colon, namely its mobile fecal association and complex structure based on two distinct mucus barrier layers coming from proximal and distal parts of the colon (PMID: ). This impacts the interpretation of how and where Ch3l1 is expressed and gets into the mucus to promote colonization. It also impacts their conclusions because the authors compare fecal vs. tissue mucus, but most of the mucus would be attached to the feces. The identity of the material described as mucus isolated from WT and IEC Ch3l1 KO mice was not biochemically verified; such verification (e.g. through Western blot) would increase confidence in the data presented. Further, the study relies upon relative microbial profiling, which can mask absolute numbers, leaving the claim of reduced overall Gram-positive species in mice lacking Ch3l1 unproven. More quantitative approaches (e.g. Quantitative Microbial Profiling, QMP) would support more definitive conclusions on the impact of Ch3l1 loss on Gram+ microbes.
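      The concern about relative profiling can be made concrete with a small sketch. The counts below are hypothetical and are not taken from the study; they only show that a Gram-positive fraction can drop while the absolute Gram-positive load is unchanged.

```python
# Toy illustration: relative abundance can fall even when absolute counts do not.
# All numbers are hypothetical and are NOT taken from the study under review.

def relative_abundance(counts):
    """Convert absolute counts per group into fractions that sum to 1."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Hypothetical absolute loads (cells per gram of luminal content).
control = {"gram_positive": 4e9, "gram_negative": 6e9}
knockout = {"gram_positive": 4e9, "gram_negative": 1.2e10}  # only Gram-negatives expand

rel_control = relative_abundance(control)
rel_knockout = relative_abundance(knockout)

print(f"control  Gram+ fraction: {rel_control['gram_positive']:.2f}")   # 0.40
print(f"knockout Gram+ fraction: {rel_knockout['gram_positive']:.2f}")  # 0.25
```

      In this toy case the Gram-positive fraction falls from 0.40 to 0.25 purely because another group expanded, which is the ambiguity that absolute quantification is meant to resolve.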

      Other weaknesses lie in the execution of the aims, leaving many claims incompletely substantiated. For example, much of the imaging data is challenging for the reader to interpret because the images are out of focus, are at too low a magnification, lack the correct control, or do not compare the same tissue regions among the different in vivo study groups. Statistical rigor could be better demonstrated, particularly when making claims based on imaging data. These are often presented as single images without any statistics (i.e. analysis of multiple images and biological replicates). These images include the LTA signal differences, FISH images, Enterococcus colonization, and mucus thickness.

    1. eLife assessment

      This important study, using three bioactive compounds as a model, demonstrates that estimating the intake of food components based on food composition databases and self-reported dietary data is highly unreliable. The authors present convincing data showing the differences in the estimated quantile of intake of three bioactive compounds between biomarker and 24-hour dietary recall with food-composition database. The work will be of broad interest to the clinical nutrition research community.

    2. Joint Public Review:

      Summary:<br /> Identifying dietary biomarkers, in particular, has become a main focus of nutrition research in the drive to develop personalized nutrition.

      The aim of this study was to determine the accuracy of using food composition databases to assess the association between dietary intake and health outcomes. The authors found that using food composition data to assess dietary intake of specific bioactives and the impact consumption has on systolic blood pressure provided vastly different outcomes depending on the method used. These findings demonstrate the difficulty in elucidating the relationship between diet and health outcomes and the need for more stringent research in the development of dietary biomarkers.

      Strengths:<br /> The primary strength of the study is the use of a large cohort in which dietary data and the measurement of three specific bioactives and blood pressure were collected on the same day. The bioactives selected have been extensively researched for their health effects. Another strength is that the authors controlled for as many variables as possible when running the simulations to get a more accurate account of how the variability in food composition can impact research findings that associate the intake of certain food components with health outcomes.

      Weaknesses:<br /> The authors address the large variability when using food composition data, e.g. the range of tea and apple intake needed to meet recommendations depending on whether the mean food composition data or the lowest reported food content is used. However, there is no discussion of the intake needed if the biomarker is used: how many cups of tea are needed to reach the suggested 200 mg/day of flavan-3-ols when using biomarker data instead of the food composition data? More information should be added on the effect of using biomarker data on dietary recommendations and risk assessment.

    1. eLife assessment

      The findings in this study are useful and may have practical implications for predicting DLBCL risk subject to further validating the bioinformatics outcomes. We found the approach and data analysis solid. However, some concerns regarding the drug sensitivity prediction and the links between the selected genes for the risk scores have been raised that need to be addressed by further functional works.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Ye et al. identified a novel tumour microenvironment (TME) signature that can help to prognosticate DLBCL. They first interrogated a publicly available dataset to identify tumour purity-related genes (TPGs) and found these TPGs were associated with extracellular matrix organisation and immune response. Protein-protein interaction analysis identified hub genes that were associated with prognosis, and 3 genes (VCAN, CD3G, C1QB) were selected to construct a prognosis model. The authors attempted to validate the findings on immunohistochemistry (IHC) and showed prognostication using an IHC assay. Finally, they showed a possible prediction of drug sensitivity using the novel signature in DLBCL.

      Strengths:<br /> This study investigated both immune and non-immune TME related to tumour purity. Tumour purity has not been thoroughly investigated in DLBCL. Hence, the prognostic significance of tumour purity demonstrated in this paper brought into light another potential area of research in DLBCL. Similarly, the investigation into non-immune TME was novel and thought-provoking, as most research in DLBCL TME has mostly been in the immune microenvironment.

      The bioinformatics approach in identifying the key TPGs was well conducted, such as the GO and KEGG enrichment analysis which supported the role of these TPGs in the modulation of the microenvironment. The findings were also validated in another dataset, which increased the confidence in this model. However, it was not clear to me why the authors chose VCAN, CD3G, and C1QB out of the 9 intersection genes that they found. It would perhaps be useful to show the statistical justification in the Supplementary Results section.

      The possible translation of these findings into clinical practice by immunohistochemistry (IHC) was a useful tool to make the findings applicable in the clinical setting. However, as stated by the authors, the real-life clinical application of these findings may be more challenging as these antigens seemed to be expressed in a continuum, rather than in a discrete manner. For example, in Figure 5A, even the low VCAN status still demonstrated strong cytoplasmic staining. Similarly, in Figure 5C, it seemed to be difficult to differentiate strong from background staining. This means pre-analytical variables may affect the staining and standardisation among different laboratories may be difficult to achieve without external controls.

      Weaknesses:<br /> Though the rationale behind choosing the TPG genes and their correlation with non-immune TME was clear, the justification for investigating CD68+ macrophages, CD4+ T cells, and CD8+ T cells was not as strong. This was done in a subsection that was supposed to investigate the prognostic values of IHC staining in VCAN, CD3G, and C1QB. Hence, the analysis of the immune compartment of the TME was rather superficial. For example, it would be insufficient to correlate CD4+ and CD8+ T cells without understanding their deeper phenotypes such as regulatory vs memory or exhausted vs activated. An attempt was made to subtype the macrophages by a bioinformatics approach but it was not further investigated with IHC.

      Similarly, the investigation into drug sensitivity was only done in silico. This investigation was adequate for hypothesis generation. However, it was not enough to substantiate the claim that TPGs can be used to predict drug sensitivity. This claim requires functional in vitro experiments to validate the bioinformatics approach, or even correlation with clinical data when the identified drugs were used in DLBCL, for example in the REMoDL-B cohort that used bortezomib.

    3. Reviewer #2 (Public Review):

      In this study, Zhenbang Ye and colleagues investigate the links between microenvironment signatures, gene expression profiles, and prognosis in diffuse large B-cell lymphoma (DLBCL). They show that increased tumor purity (i.e., a higher proportion of tumor cells relative to surrounding stromal components) is associated with a worse prognosis. They then show that three genes associated with tumor purity (VCAN, CD3G, and C1QB) correlate with patterns of immune cell infiltration and can be used to create a risk-scoring system that predicts prognosis and response to some therapies, and that can be replicated by immunohistochemistry (IHC).

      1. The two strengths of the study are its relatively large sample size (n = 190) and the strong prognostic significance of the risk-scoring system. It is worth noting that the validation of this scoring with IHC, a simple technique already routinely used for the diagnosis and classification of DLBCL, increases the potential for clinical translation. However, the correlative nature of the study limits the conclusions that can be drawn in regard to links between the risk scoring system, the tumor microenvironment, and the biology of DLBCL.

      2. The tumor microenvironment has been extensively studied in DLBCL and a prognostic implication has already been established (for instance, Steen et al., Cancer Cell, 2021). In addition, associations have already been established in non-Hodgkin lymphoma between prognosis and expression of C1QB (Rapier-Sharman et al., Journal of Bioinformatics and Systems Biology, 2022), VCAN (S. Hu et al., Blood, 2013), and CD3G (Chen et al., Medical Oncology, 2022). Nevertheless, one of the strengths and novelty aspects of the study is the combination of these 3 genes into a risk score that is also valid by immunohistochemistry (IHC), which substantially facilitates a potential clinical translation.

      3. Figures 1A-B: tumor purity is calculated using the ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data) algorithm (Yoshihara et al., Nature Communications, 2013). The ESTIMATE algorithm is based on two gene signatures ("stromal" and "immune"). It is therefore expected that tumor purity measured by the ESTIMATE algorithm will correlate with the expression of multiple genes. Importantly, C1QB is included in the stromal signature of the ESTIMATE algorithm meaning that, by definition, it will be correlated with tumor purity in that setting.
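      The circularity raised in this point can be illustrated with a toy simulation. The sketch below does not implement the ESTIMATE algorithm: the five-gene "signature", the logistic mapping from score to purity, and all variable names are assumptions made purely for illustration. It shows that a gene contributing to a signature score is correlated with any purity estimate derived from that score, even when all genes are simulated as independent noise.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

rng = random.Random(0)          # fixed seed so the sketch is reproducible
n_samples, n_genes = 200, 5     # hypothetical sizes, not the real signature

# Independent noise: no gene has any true biological link to purity here.
expression = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]

# Toy signature score (mean of member genes) and a toy purity estimate that
# decreases monotonically with the score. This mapping is an assumption.
signature_score = [sum(row) / n_genes for row in expression]
toy_purity = [1 / (1 + math.exp(s)) for s in signature_score]

member_gene = [row[0] for row in expression]                  # inside the signature
outside_gene = [rng.gauss(0, 1) for _ in range(n_samples)]    # not in the signature

r_member = pearson(member_gene, toy_purity)
r_outside = pearson(outside_gene, toy_purity)
print(f"member gene vs purity:  r = {r_member:.2f}")
print(f"outside gene vs purity: r = {r_outside:.2f}")
```

      The member gene shows a clear negative correlation with the toy purity estimate by construction, whereas the outside gene does not, which is why a correlation between C1QB and ESTIMATE-derived purity carries little independent information.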

      4. Figure 2A: as established in Figure 1C, high tumor purity is associated with worse prognosis. Later in the manuscript, it is also shown that C1QB expression is associated with a worse prognosis. However, Figure 2A shows that C1QB is associated with decreased tumor purity. It therefore makes it less likely that the prognostic role of C1QB expression is related to its impact on tumor purity. The prognostic impact could be related to different patterns of immune cell infiltration, as shown later. However, the evidence presented in the study is correlative in nature and not sufficient to draw this conclusion.

      5. Figure 3G: although the risk score has a strong prognostic implication, the correlation between the risk score and tumor purity is significant but not very strong (R = 0.376). It is therefore likely that other important biological factors explain the correlation between the risk score and prognosis.
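      As a rough worked calculation, and assuming the reported R is a Pearson correlation coefficient (the review does not state this), squaring it gives the share of variance in one variable that is linearly accounted for by the other:

```python
# Assumption: R = 0.376 is a Pearson correlation coefficient.
r = 0.376
r_squared = r ** 2                 # coefficient of determination
unexplained = 1 - r_squared

print(f"R^2 = {r_squared:.3f}")                   # R^2 = 0.141
print(f"shared variance: {r_squared:.1%}")        # shared variance: 14.1%
print(f"unexplained variance: {unexplained:.1%}") # unexplained variance: 85.9%
```

      Under that assumption, tumor purity would account for only about 14% of the variance in the risk score, which is consistent with the reviewer's point that other factors are likely to be involved.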

      6. Figure 6: the drug sensitivity analysis includes a wide range of established and investigational drugs with varied mechanisms of action. Although the difference in sensitivity between tumors with low and high-risk scores shows statistical significance for certain drugs, the absolute difference appears small in most cases and is of unclear biological significance. In addition, even though the risk score is statistically related to drug sensitivity, there is no direct evidence that the differences in drug sensitivity are directly related to tumor purity.

    1. eLife assessment

      This valuable manuscript reports alterations in autophagy present in dopaminergic neurons differentiated from iPSCs in patients with WDR45 mutations. The authors identified compounds that improved the defects present in mutant cells by generating isogenic iPSC without the mutation and performing an automated drug screening. The methodological approaches are solid, but the claims still need to be completed: showing the effects of the identified compounds on iron-related alterations is crucial. The effects of these drugs in vivo would be a great addition to the study.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In the current study, Papandreou et al. developed an iPSC-based midbrain dopaminergic neuronal cell model of Beta-Propeller Protein-Associated Neurodegeneration (BPAN), which is caused by mutations in the WDR45 gene and is known to impair autophagy. They also noted defective autophagy and abnormal BPAN-related gene expression signatures. Further, they performed a drug screening and identified five cardiac glycosides. Treatment with these drugs effectively improved autophagy defects and restored gene expression.

      Strengths:<br /> Seeing the autophagy defects and impaired expression of BPAN-related genes adds strength to this study. Importantly, this work shows the value of iPSC-based modeling in studying disease and finding therapeutic strategies for genetic disorders, including BPAN.

      Weaknesses:<br /> It is unclear whether these cells show iron metabolism defects and whether treatment with these drugs can ameliorate the iron metabolism phenotypes.

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, the authors aim to demonstrate that cardiac glycosides restore autophagy flux in an iPSC-derived mDA neuronal model of WDR45 deficiency. They established a patient-derived induced pluripotent stem cell (iPSC)-based midbrain dopaminergic (mDA) neuronal model and performed a medium-throughput drug screen using high-content imaging-based IF analysis. Several compounds were identified to ameliorate disease-specific phenotypes in vitro.

      Strengths:<br /> This manuscript engaged in an important topic and yielded some interesting data.

      Weaknesses:<br /> This manuscript failed to provide solid evidence to support the conclusion.

    1. eLife assessment

      Overall, the reviewers found the significance of the work valuable to the field of visual neuroscience, particularly given the large data set and strength of the method used that allowed for spatial analysis of neuronal responses in macaque V1. The evidence was deemed compelling, owing in part to the consistency of responses across animals and the fitness of modeling. Ways to improve the manuscript as outlined include an expanded discussion of similar prior literature and limitations of the method used to read out neuronal responses.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Zhang et al. investigated the relationship between monocular and binocular responses of V1 superficial-layer neurons using two-photon calcium imaging. They found a strong relationship in their data: neurons that exhibited a greater preference for one eye or the other (high ocular dominance) were more likely to be suppressed under binocular stimulation, whereas neurons that were more equally driven by each eye (low ocular dominance) were more likely to be enhanced by binocular stimulation. This result chiefly demonstrates the relationship between ocular dominance and binocular responses in V1, corroborating what has been shown previously using electrophysiological techniques but now with greater spatial resolution (albeit less temporal resolution). The binocular responses were well-fitted by a model that institutes divisive normalization between the eyes that accounts for both the suppression and enhancement phenomena observed in the subpopulation of binocular neurons. In so doing, the authors reify the importance of incorporating ocular dominance in computational models of binocular combination.

      The conclusions of this paper are mostly well supported by the data, but there are some limitations of the methodology that need to be clarified, and an expansion of how the results relate to previous work would better contextualize these important findings in the literature.

      Strengths:<br /> The two-photon imaging technique used to resolve the activity of individual neurons within intact brain tissue grants a host of advantages. Foremost, two-photon imaging confers considerably high spatial resolution. As a result, the authors were able to sample and analyze the activity from thousands of verified superficial-layer V1 neurons. The animal model used, awake macaques, is also highly relevant for the study of binocular combination. Macaques, like humans, are binocular animals, meaning they have forward-facing eyes that confer overlapping visual fields. Importantly, macaque V1 is organized into cortical columns that process specific visual features from the separate eyes just like in humans. In combination with a powerful imaging technique, this allowed the authors to evaluate the monocular and binocular response profiles of V1 neurons that are situated within neighboring ocular dominance columns, a novel feat. To this aim, the approach was well-executed and should instill further confidence in the notion that V1 neurons combine monocular information in a manner that is dependent on the strength of their ocular dominance.

      Weaknesses:<br /> While two-photon imaging provides excellent spatial resolution, its temporal resolution is often lower compared to some other techniques, such as electrophysiology. This limits the ability to study the fast dynamics of neuronal activity, a well-understood trade-off of the method. The larger issue is that the authors draw comparisons to electrophysiological studies without explicit appreciation of the temporal difference between these techniques. In a similar vein, two-photon imaging is limited spatially in terms of cortical depth, preferentially sampling from neurons in layers 2/3. This limitation does not invalidate any of the interpretations but should be considered by readers, especially when making comparisons to previous electrophysiological reports using microelectrode linear arrays that sample from all cortical layers. Indeed, it is likely that a complete picture of early cortical binocular processing will require high spatial resolution (i.e., sampling from neurons in neighboring ocular dominance columns, from pia mater to white matter) at the biophysically relevant timescales (1 ms resolution, capturing response dynamics over the full duration of the stimulus presentation, including the transient onset and steady-state periods).

    3. Reviewer #2 (Public Review):

      Summary:<br /> This study examines the pattern of responses produced by the combination of left-eye and right-eye signals in V1. For this, they used calcium imaging of neurons in V1 of awake, fixating monkeys. They take advantage of calcium imaging, which yields large populations of neurons in each field of view. With their data set, they observe how response magnitude relates to ocular dominance across the entire population. They analyze carefully how the relationship changed as the visual stimulus switched from contra-eye only, ipsi-eye only, and binocular. As expected, the contra-eye-dominated neurons responded strongly with a contra-eye-only stimulus. The ipsi-eye-dominated neurons responded strongly with an ipsi-eye-only stimulus. The surprise was responses to a binocular stimulus. The responses were similarly weak across the entire population, regardless of each neuron's ocular dominance. They conclude that this pattern of responses could be explained by interocular divisive normalization, followed by binocular summation.
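      The idea of interocular divisive normalization followed by binocular summation can be sketched in a few lines. This is a generic toy form, not the authors' fitted model: the function name, the gain pairs, and the suppression weight `w` are hypothetical choices for illustration only.

```python
def toy_response(contra_gain, ipsi_gain, contra_contrast, ipsi_contrast, w=0.5):
    """Toy model: each eye's signal is divided by a term that grows with the
    other eye's contrast (interocular normalization), then the two normalized
    signals are weighted by the neuron's eye-specific gains and summed."""
    contra_signal = contra_contrast / (1.0 + w * ipsi_contrast)
    ipsi_signal = ipsi_contrast / (1.0 + w * contra_contrast)
    return contra_gain * contra_signal + ipsi_gain * ipsi_signal

# Hypothetical neurons described by (contra_gain, ipsi_gain).
strongly_monocular = (1.0, 0.0)   # high ocular dominance
balanced = (0.5, 0.5)             # low ocular dominance

for name, (gc, gi) in [("monocular", strongly_monocular), ("balanced", balanced)]:
    contra_only = toy_response(gc, gi, 1.0, 0.0)
    ipsi_only = toy_response(gc, gi, 0.0, 1.0)
    binocular = toy_response(gc, gi, 1.0, 1.0)
    print(f"{name:9s} contra={contra_only:.2f} ipsi={ipsi_only:.2f} binocular={binocular:.2f}")
```

      With these assumed parameters, the strongly monocular unit responds less to binocular stimulation than to its preferred eye alone, the balanced unit responds more, and both give the same binocular response, which qualitatively mirrors the pattern described in the summary above.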

      Strengths:<br /> A major strength of this work is that the model-fitting was done on a large population of simultaneously recorded neurons. This approach is an advancement over previous work, which did model-fitting on individual neurons. The fitted model in the manuscript represents the pattern observed across the large population in V1, and washes out any particular property of individual neurons. Given the large neuronal population from which the conclusion was drawn, the authors provide solid evidence supporting their conclusion. They also observed consistency across 5 fields of view.

      The experiments were designed and executed appropriately to test their hypothesis. Their data support their conclusion.

      Weaknesses:<br /> One weakness of their study is that calcium signals can exaggerate the nonlinear properties of neurons. Calcium imaging renders poor responses poorer and strong responses stronger, compared to single-unit recording. In particular, the dramatic change in the population response between monocular stimulation and binocular stimulation could actually be less pronounced when measured with single-unit recording methods. This means their choice of recording method could have accidentally exaggerated the evidence of their finding.

      The implication of their finding is that strong ocular dominance is the result of release from interocular suppression by a monocular stimulus, rather than the lack of binocular combination as many traditional studies have assumed. This could significantly advance our understanding of the binocular combination circuitry of V1. The entire population of neurons could be part of a binocular combination circuitry present in V1.

    4. Reviewer #3 (Public Review):

      The authors have made simultaneous recordings of the responses of large numbers of neurons from the primary visual cortex using optical two-photon imaging of calcium signals from the superficial layers of the cortex. Recordings were made to compare the responses of the cortical neurons under normal binocular viewing of a flat screen with both eyes open and monocular viewing of the same screen with one eye's view blocked by a translucent filter. The screen displayed visual stimuli comprising small contrast patches of Gabor function distributions of luminance, a stimulus that is known to excite cortical neurons.

      This is an important data set, given the large numbers of neurons recorded. The authors present a simple model to explain the binocular combination of neuronal signals from the right and left eyes.

      The limitations of the paper as written are as follows. These points can be addressed with some additional analysis and rewriting of sections of the paper. No new experimental data need to be collected.

      1) The authors should acknowledge the fact that these recordings arise from neurons in the superficial layers of the cortex. This limitation arises from the usual constraints on optical imaging in the macaque cortex. This means that the sample of neurons forming this data set is not fully representative of the population of binocular neurons within the visual cortex. This limitation is important in comparing the outcome of these experiments with the results from other studies of binocular combination, which have used single-electrode recording. Electrode recording will result in a sample of neurons that is drawn from many layers of the cortex, rather than just the superficial layers.

      2) Single-neuron recording of binocular neurons in the primary visual cortex has shown that these neurons often have some spontaneous activity. Assessment of this spontaneous level of firing is important for accurate model fitting [1]. The paper here should discuss the level of spontaneous neuronal firing and its potential significance.

      3) The arrangements for visual stimulation and comparison of binocular and monocular responses mean that the stereoscopic disparity of the binocular stimuli is always at zero or close to zero. The animal's fixation point is in the centre of a single display that is viewed binocularly. The fixation point is, by definition, at zero disparity. The other points on the flat display are also at zero disparity or very close to zero because they lie in the same depth plane. There will be some small deviations from exactly zero because the geometry of the viewing arrangements results in the extremities of the display being at a slightly different distance than the centre. Therefore, the visual stimulation used to test the binocular condition is always at zero disparity, with a slight deviation from zero at the edges of the display, and never changes. [There is a detail that can be ignored. The experimenters tested neurons with visual stimulation at different real distances from the eyes, but this is not relevant here. Provided the animals accurately converged their eyes on the provided binocular fixation point, then the disparity of the visual stimuli will always be at or close to zero, regardless of viewing distance in these circumstances.] However, we already know from earlier work that neurons in the visual cortex exhibit a range of selectivity for binocular disparity. Some neurons have their peak response at non-zero disparities, representing binocular depths nearer than the fixation depth or beyond it. The response of other neurons is maximally suppressed by disparities at the depth of the fixation point (so-called Tuned Inhibitory [TI] neurons). 
      The simple model and analysis presented in the paper for the summation of monocular responses to predict binocular responses will perform adequately for neurons that are tuned to zero disparity, so-called tuned excitatory [TE] neurons, but is necessarily compromised when applied to neurons that have other, different tuning profiles. Specifically, when neurons are stimulated binocularly with a non-preferred disparity, the binocular response may be lower than the monocular response [2, 3]. This more realistic view of binocular responses needs to be considered by the authors and integrated into their modelling.

      4) The data in the paper show some features that have been reported before but are not captured by the model. Notably, for neurons with extreme values of ocular dominance, the binocular response is typically less than the larger of the two monocular responses. This is apparent in the row of plots in Figure 2D from individual animals and in the pooled data in Figure 2E. Responses of this type are characteristic of tuned inhibitory [TI] neurons [2]. It is not immediately clear why this feature of the data does not appear in the summary and analysis in Figure 3. The paper text states that the responses were "first normalized by the median of the binocular responses". This will certainly get rid of this characteristic of the data, but this step needs better justification, or an amendment to the main analysis is needed. In the present form, the model and analysis do not appear to fit the data in Figure 2 as accurately as needed. The authors should address the discrepancy between the data as presented in Figures 2D, E, and Figure 3.

      Citations<br /> 1. Prince, S.J.D., Pointon, A.D., Cumming, B.G., and Parker, A.J. (2002). Quantitative analysis of the responses of V1 neurons to horizontal disparity in dynamic random-dot stereograms. Journal of Neurophysiology, 87: 191-208.<br /> 2. Prince, S.J.D., Cumming, B.G., and Parker, A.J. (2002). Range and mechanism of encoding of horizontal disparity in macaque V1. Journal of Neurophysiology, 87: 209-221.<br /> 3. Poggio, G.F. and Fischer, B. (1977). Binocular interaction and depth sensitivity in striate and prestriate cortex of behaving rhesus monkey. Journal of Neurophysiology, 40: 1392-1405. doi: 10.1152/jn.1977.40.6.1392.

    1. eLife assessment

      These are valuable findings that support a link between low-dimensional brain network organisation, patterns of ongoing thought, and trait-level personality factors, making it relevant for researchers in the field of spontaneous cognition, personality, and neuropsychiatry. While this link is not entirely new, the paper brings to bear a rich dataset and a well-conducted study, to approach this question in a novel way. The evidence in support of the findings is convincing.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The authors ran an explorative analysis in order to describe how a "tri-partite" brain network model could describe the combination of resting fMRI data and individual characteristics. They utilized previously obtained fMRI data across four scanning runs in 144 individuals. At the end of each run, participants rated their patterns of thinking on 12 statements (short multi-dimensional experience sampling-MDES) using a 0-100% visual analog scale. Also, 71 personality traits were obtained on 21 questionnaires. The authors ran two separate principal component analyses (PCA) to obtain low dimensional summaries of the two individual characteristics (personality traits from questionnaires, and thought patterns from MDES). The dimensionality reduction of the fMRI data was done by means of gradient analysis, which was combined with Neurosynth decoding to visualize the functional axis of the gradients. To test the reliability of thought components across scanning time, intra-class correlation coefficients (ICC) were calculated for the thought patterns, and discriminability indices were calculated for whole gradients. The relationship between individual differences in traits, thoughts, and macro-scale gradients was tested with multivariate regression.

      The authors found: a) reliability of thought components across the one hour of scanning, b) Gradient 1 differentiated between visual regions and DMN, Gradient 2 dissociated somatomotor from visual cortices, Gradient 3 differentiated the DMN from the fronto-parietal system, c) the associations between traits/thought patterns and brain gradients revealed significant effects of "introversion" and "specific internal" thought: "Introversion" was associated with variant parcels on the three gradients, with most parcels belonging to the VAN and then to the DMN; and "Specific internal thought" was associated with variant parcels on the three gradients, with most parcels belonging to the DAN and then to the visual network. The authors conclude that interactions between attention systems and the DMN are important influences on ongoing thought at rest.
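      The reliability analysis mentioned in the summary rests on intra-class correlation coefficients. The review does not say which ICC variant was used, so the sketch below assumes the one-way random-effects form, ICC(1,1), and uses made-up component scores for a few participants across four runs.

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1).
    scores: list of rows, one per participant, each holding k repeated measures."""
    n, k = len(scores), len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-participant and within-participant mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical thought-component scores: 3 participants x 4 scanning runs.
perfectly_stable = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
mostly_stable = [[1.0, 1.2, 0.9, 1.1], [2.1, 1.9, 2.0, 2.2], [3.0, 2.8, 3.1, 2.9]]

print(f"perfectly stable: ICC = {icc_oneway(perfectly_stable):.2f}")
print(f"mostly stable:    ICC = {icc_oneway(mostly_stable):.2f}")
```

      Values near 1 indicate that differences between participants dominate run-to-run fluctuation within a participant, which is the sense in which the thought components are described as reliable across the session.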

      Strengths:<br /> The study's strength lies in its attempt to combine brain activity with individual characteristics using state-of-the-art methodologies.

      Weaknesses:<br /> The study protocol in its current form restricts replicability. This is largely due to missing information on the MRI protocol and data preprocessing. The article refers the reader to the work of Mendes et al. (2019), which is said to provide this information, but the paper should stand alone and state this crucial material itself. Also, effect sizes are provided only for the multiple multivariate regression of the inter-class correlations, which makes it difficult to appreciate the power of the other obtained results.

    3. Reviewer #2 (Public Review):

      The authors set out to draw further links between neural patterns observed at "rest" during fMRI, with their related thought content and personality traits. More specifically, they approached this with a "tri-partite network" view in mind, whereby the ventral attention network (VAN), the dorsal attention network (DAN), and the default mode network (DMN) are proposed to play a special role in ongoing conscious thought. They used a gradients approach to determine the low dimensional organisation of these networks. In concert, using PCA they reduced thought patterns captured at four time points during the scan, as well as traits captured from a large battery of questionnaires.

      The main findings were that specific thought and trait components were related to variations in the organisation of the tri-partite networks, with respect to cortical gradients.

      Strengths of the methods/results: Having a long (1 hr) resting state MRI session, which could be broken down into four separate scanning/sampling components is a strength. Importantly, the authors could show (via intra-class correlation coefficients) the similarity of thoughts and connectivity gradients across the entire session. Not only did this approach increase the richness of the data available to them, it speaks in an interesting way to the stability of these measures. The inclusion of both thought patterns during scanning along with trait-level dispositional factors is most certainly a strength, as many studies will often include either/or of these, rather than trying to reconcile across. Of the two main findings, the finding that detailed self-generated thought was associated with a decoupling of regions of DAN from regions in DMN was particularly compelling, in light of mounting literature from several fields that support this.

      Weaknesses of the methods/results: Considering the richness of the thought and personality data, I was a little surprised that only two main findings emerged (i.e., a relationship with trait introversion, and a relationship with the "specific internal" thought pattern). I wondered whether, at least in part and in relation to traits, this might stem from the large and varied set of questionnaires used to discern the traits. These questionnaires mostly comprised personality/mood, but some sampled things that do not fall into that category (e.g., musicality, internet addiction, sleep), and some related directly to spontaneous thought properties (e.g., mind wandering, musical imagery). It would be interesting to see what relationships would emerge by being more selective in the traits measured, and in the tools to measure them.

      Taken together, the main findings are interesting enough. However, the real significance of this work, and its impact, lie in the richness of the approach: combining across fMRI, spontaneous thought, and trait-level factors. Triangulating these data has important potential for furthering our understanding of brain-behaviour relationships across different levels of organisation.

    1. eLife assessment

      This paper provides a valuable alternative explanation for the influence of environmental volatility on learning, attributing such effects to a mixture of strategies (MoS), rather than changes in the learning rate. The authors demonstrate that the MoS model provides a superior fit to previously published data, and suggest that atypical learning in individuals with anxiety and depression might reflect their use of a suboptimal strategy. While the approach should be of interest to researchers across decision sciences, the evidence is incomplete, limiting its potential impact.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This paper describes a reanalysis of data collected by Gagne et al. (2020), who investigated how human choice behaviour differs in response to changes in environmental volatility. Several studies to date have demonstrated that individuals appear to increase their learning rate in response to greater volatility and that this adjustment is reduced amongst individuals with anxiety and depression. The present authors challenge this view and instead describe a novel Mixture of Strategies (MOS) model, that attributes individual differences in choice behaviour to different weightings of three distinct decision-making strategies. They demonstrate that the MOS model provides a superior fit to the data and that the previously observed differences between patients and healthy controls may be explained by patients opting for a less cognitively demanding, but suboptimal, strategy.

      Strengths:<br /> The authors compare several models (including the original winning model in Gagne et al., 2020) that could feasibly fit the data. These are clearly described and are evaluated using a range of model diagnostics. The proposed MOS model appears to provide a superior fit across several tests.

      The MOS model output is easy to interpret and has good face validity. This allows for the generation of clear, testable, hypotheses, and the authors have suggested several lines of potential research based on this.

      Weaknesses:<br /> The authors justify this reanalysis by arguing that learning rate adjustment (which has previously been used to explain choice behaviour on volatility tasks) is likely to be too computationally expensive and therefore unfeasible. It is unclear how to determine how "expensive" learning rate adjustment is, and how this compares to the proposed MOS model (which also includes learning rate parameters), which combines estimates across three distinct decision-making strategies.

      As highlighted by the authors, the model is limited in its explanation of previously observed learning differences based on outcome value. It's currently unclear why there would be a change in learning across positive/negative outcome contexts, based on strategy choice alone.

    3. Reviewer #2 (Public Review):

      Summary:<br /> Previous research shows that humans tend to adjust learning in environments where stimulus-outcome contingencies become more volatile. This learning rate adaptation is impaired in some psychiatric disorders, such as depression and anxiety. In this study, the authors reanalyze previously published data on a reversal-learning task with two volatility levels. Through a new model, they provide some evidence for an alternative explanation whereby the learning rate adaptation is driven by different decision-making strategies and not learning deficits. In particular, they propose that adjusting learning can be explained by deviations from the optimal decision-making strategy (based on maximizing expected utility) due to response stickiness or focus on reward magnitude. Furthermore, a factor related to the general psychopathology of individuals with anxiety and depression negatively correlated with the weight on the optimal strategy and response stickiness, while it correlated positively with the magnitude strategy (a strategy that ignores the probability of outcome).

      Strengths:<br /> The main strength of the study is a novel and interesting explanation of an otherwise well-established finding in human reinforcement learning. This proposal is supported by rigorously conducted parameter retrieval and the comparison of the novel model to a wide range of previously published models.

      Weaknesses:<br /> My main concern is that the winning model (MOS6) does not have an error term (inverse temperature parameter beta is fixed to 8.804).

      1) It is not clear why the beta is not estimated or how the value presented here was chosen. It is reported as being an average value, but it is not clear from which parameter estimation. Furthermore, with an average value applied to participants who would have lower values of inverse temperature (more stochastic behaviour), the model is likely overfitting.

      2) In the absence of a noise parameter, the model will have to classify behaviour that is not explained by the optimal strategy (where participants simply did not pay attention or were not motivated) as being due to one of the other two strategies.

      3) A model comparison among models with inverse temperature and variable subsets of the three strategies (EU + MO, EU + HA) would be interesting to see. Similarly, a comparison of the MOS6 model to other models in which the inverse temperature parameter is fixed (to 8.804) would be informative.

      This is an important limitation because the same simulation as with the MOS model in Figure 3b can be achieved by a more parsimonious (but less interesting) manipulation of the inverse temperature parameter.
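
      To make the role of the inverse temperature concrete, the following is a minimal sketch of a two-option softmax choice rule. The function name `choice_prob` and the value difference of 0.2 are illustrative assumptions and are not taken from the manuscript; only the fixed value 8.804 comes from the review text above.

```python
import math


def choice_prob(value_diff: float, beta: float) -> float:
    """Probability of choosing option 1 under a two-option softmax.

    value_diff: V1 - V2, the difference in subjective value (hypothetical units).
    beta: inverse temperature; larger beta means more deterministic choices.
    """
    return 1.0 / (1.0 + math.exp(-beta * value_diff))


# Hypothetical value difference in favour of option 1.
diff = 0.2
for beta in (1.0, 8.804):
    print(f"beta={beta}: P(choose option 1) = {choice_prob(diff, beta):.3f}")
```

      With the same value difference, a lower beta moves the choice probability towards 0.5. A model with beta fixed at a high value therefore has no way to absorb noisy choices other than through its remaining parameters.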

      Furthermore, the claim that the EU represents an optimal strategy is a bit overstated. The EU strategy is the only one of the three that assumes participants learn about the stimulus-outcome contingencies. Higher EU strategy utilisation will include participants who are more optimal (in utility-maximisation terms), but also those who simply learned better and completely ignored the reward magnitude.

      Other minor issues that I have are the following:<br /> The mixture strategies model is an interesting proposal, but seems to be a very convoluted way to ask: to what degree are decisions of subjects affected by reward, what they've learned, and response stickiness? It seems to me that the same set of questions could be addressed with a simpler model that would define choice decisions through a softmax with a linear combination of the difference in rewards, the difference in probabilities, and a stickiness parameter.
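
      The simpler alternative described above can be sketched as a logistic (two-option softmax) choice rule on a weighted sum of three terms. This is an illustration under stated assumptions, not the model from the manuscript or from Gagne et al. (2020): the function names, the weights `w_mag`, `w_prob`, `w_stick`, and the trial tuple layout are all hypothetical.

```python
import math


def p_choose_left(d_magnitude, d_probability, prev_choice, w_mag, w_prob, w_stick):
    """P(choose left) from a weighted sum of three decision variables.

    d_magnitude: outcome magnitude of left minus right.
    d_probability: learned outcome probability of left minus right.
    prev_choice: +1 if the previous choice was left, -1 if right, 0 on the first trial.
    """
    logit = w_mag * d_magnitude + w_prob * d_probability + w_stick * prev_choice
    return 1.0 / (1.0 + math.exp(-logit))


def neg_log_likelihood(trials, w_mag, w_prob, w_stick):
    """Negative log-likelihood of observed choices.

    trials: iterable of (d_magnitude, d_probability, prev_choice, chose_left).
    """
    eps = 1e-12  # keep probabilities away from 0 and 1 so log() stays finite
    nll = 0.0
    for d_mag, d_prob, prev, chose_left in trials:
        p = p_choose_left(d_mag, d_prob, prev, w_mag, w_prob, w_stick)
        p = min(max(p, eps), 1.0 - eps)
        nll -= math.log(p if chose_left else 1.0 - p)
    return nll
```

      In such a model, the overall scale of the three weights plays the role of an inverse temperature, so choice noise is estimated rather than fixed.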

      Learning rate adaptation was also shown with tasks where decision-making strategies play a less important role, such as the Predictive Inference task (see for instance Nassar et al., 2010). When discussing the merit of the findings of this study on learning rate adaptation across volatility blocks, this work would be essential to mention.

    4. Reviewer #3 (Public Review):

      Summary:<br /> This paper presents a new formulation of a computational model of adaptive learning amid environmental volatility. Using a behavioral paradigm and data set made available by the authors of an earlier publication (Gagne et al., 2020), the new model is found to fit the data well. The model's structure consists of three weighted controllers that influence decisions on the basis of (1) expected utility, (2) potential outcome magnitude, and (3) habit. The model offers an interpretation of psychopathology-related individual differences in decision-making behavior in terms of differences in the relative weighting of the three controllers.

      Strengths:<br /> The newly proposed "mixture of strategies" (MOS) model is evaluated relative to the model presented in the original paper by Gagne et al., 2020 (here called the "flexible learning rate" or FLR model) and two other models. Appropriate and sophisticated methods are used for developing, parameterizing, fitting, and assessing the MOS model, and the MOS model performs well on multiple goodness-of-fit indices. The parameters of the model show decent recoverability and offer a novel interpretation for psychopathology-related individual differences. Most remarkably, the model seems to be able to account for apparent differences in behavioral learning rates between high-volatility and low-volatility conditions even with no true condition-dependent change in the parameters of its learning/decision processes. This finding calls into question a class of existing models that attribute behavioral adaptation to adaptive learning rates.

      Weaknesses:<br /> 1. Some aspects of the paper, especially in the methods section, lacked clarity or seemed to assume context that had not been presented. I found it necessary to set the paper down and read Gagne et al., 2020 in order to understand it properly.

      2. There is little examination of why the MOS model does so well in terms of model fit indices. What features of the data is it doing a better job of capturing? One thing that makes this puzzling is that the MOS and FLR models seem to have most of the same qualitative components: the FLR model has parameters for additive weighting of magnitude relative to probability (akin to the MOS model's magnitude-only strategy weight) and for an autocorrelative choice kernel (akin to the MOS model's habit strategy weight). So it's not self-evident where the MOS model's advantage is coming from.

      3. One of the paper's potentially most noteworthy findings (Figure 5) is that when the FLR model is fit to synthetic data generated by the expected utility (EU) controller with a fixed learning rate, it recovers a spurious difference in learning rate between the volatile and stable environments. Although this is potentially a significant finding, its interpretation seems uncertain for several reasons:

      - According to the relevant methods text, the result is based on a simulation of only 5 task blocks for each strategy. It would be better to repeat the simulation and recovery multiple times so that a confidence interval or error bar can be estimated and added to the figure.

      - It makes sense that learning rates recovered for the magnitude-oriented (MO) strategy are near zero, since behavior simulated by that strategy would have no reason to show any evidence of learning. But this makes it perplexing why the MO learning rate in the volatile condition is slightly positive and slightly greater than in the stable condition.

      - The pure-EU and pure-MO strategies are interpreted as being analogous to the healthy control group and the patient group, respectively. However, the actual difference in estimated EU/MO weighting between the two participant groups was much more moderate. It's unclear whether the same result would be obtained for a more empirically plausible difference in EU/MO weighting.

      - The fits of the FLR model to the simulated data "controlled all parameters except for the learning rate parameters across the two strategies" (line 522). If this means that no parameters except learning rate were allowed to differ between the fits to the pure-EU and pure-MO synthetic data sets, the models would have been prevented from fitting the difference in terms of the relative weighting of probability and magnitude, which better corresponds to the true difference between the two strategies. This could have interfered with the estimation of other parameters, such as learning rate.

      - If, after addressing all of the above, the FLR model really does recover a spurious difference in learning rate between stable and volatile blocks, it would be worth more examination of why this is happening. For example, is it because there are more opportunities to observe learning in those blocks?

      4. Figure 4C shows that the habit-only strategy is able to learn and adapt to changing contingencies, and some of the interpretive discussion emphasizes this. (For instance, line 651 says the habit strategy brings more rewards than the MO strategy.) However, the habit strategy doesn't seem to have any mechanism for learning from outcome feedback. It seems unlikely it would perform better than chance if it were the sole driver of behavior. Is it succeeding in this example because it is learning from previous decisions made by the EU strategy, or perhaps from decisions in the empirical data?

      5. For the model recovery analysis (line 567), the stated purpose is to rule out the possibility that the MOS model always wins (line 552), but the only result presented is one in which the MOS model wins. To assess whether the MOS and FLR models can be differentiated, it seems necessary also to show model recovery results for synthetic data generated by the FLR model.

      6. To the best of my understanding, the MOS model seems to implement valence-specific learning rates in a qualitatively different way from how they were implemented in Gagne et al., 2020, and other previous literature. Line 246 says there were separate learning rates for upward and downward updates to the outcome probability. That's different from using two learning rates for "better"- and "worse"-than-expected outcomes, which will depend on both the direction of the update and the valence of the outcome (reward or shock). Might this relate to why no evidence for valence-specific learning rates was found even though the original authors found such evidence in the same data set?
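
      The distinction drawn in point 6 can be illustrated with two delta-rule updates. This is a sketch under assumptions, not code from either paper: it assumes `p` tracks the probability that an outcome (reward or shock) is delivered for an option, and all function and parameter names are hypothetical.

```python
def update_by_direction(p, outcome, lr_up, lr_down):
    """Delta rule with the learning rate chosen by the direction of the update.

    p: current estimate of the outcome probability (0..1).
    outcome: 1 if the outcome was delivered, 0 otherwise.
    """
    error = outcome - p
    lr = lr_up if error > 0 else lr_down
    return p + lr * error


def update_by_surprise(p, outcome, outcome_is_aversive, lr_better, lr_worse):
    """Delta rule with the learning rate chosen by better/worse than expected.

    A delivered outcome is good news for a reward but bad news for a shock,
    so the mapping from update direction to learning rate flips with valence.
    """
    error = outcome - p
    better_than_expected = (error > 0) != outcome_is_aversive
    lr = lr_better if better_than_expected else lr_worse
    return p + lr * error


# Same upward update, different learning rate depending on valence.
print(update_by_direction(0.5, 1, lr_up=0.4, lr_down=0.1))
print(update_by_surprise(0.5, 1, False, lr_better=0.4, lr_worse=0.1))
print(update_by_surprise(0.5, 1, True, lr_better=0.4, lr_worse=0.1))
```

      In the reward case the two schemes coincide, whereas in the shock case a delivered outcome is an upward update but a worse-than-expected event, so the two parameterisations apply different learning rates to the same trial.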

      7. The discussion (line 649) foregrounds the finding of greater "magnitude-only" weights with greater "general factor" psychopathology scores, concluding it reflects a shift toward simplifying heuristics. However, the picture might not be so straightforward because "habit" weights, which also reflect a simplifying heuristic, correlated negatively with the psychopathology scores.

    1. Reviewer #3 (Public Review):

      Summary:<br /> Pham, Pahuja, Hagenbeek, et al. have conducted a comprehensive range of assays to biochemically and genetically determine TEAD degradation through RNF146 ubiquitination. Additionally, they designed a PROTAC protein degrader system to regulate the Hippo pathway through TEAD degradation. Overall, the data appear robust. However, the manuscript lacks detailed methodological descriptions, which should be addressed and improved. For instance, the methods used to analyze the K48 ubiquitination site on TEAD and the gene expression analysis of Hippo signaling are unclear. Furthermore, the multiple proteomics, RNA-seq, and ATAC-seq data must be made publicly available upon publication to ensure reproducibility. Most of the main figures are of low resolution, which needs addressing.

      Strengths:<br /> - A broad range of assays was used to robustly determine the role of RNF146 in TEAD degradation.<br /> - Development of novel PROTAC for degrading TEAD.

      Weaknesses:<br /> - An orthogonal approach is needed (e.g., PARP1 inhibitor) to demonstrate PARP1's dependency in TEAD ubiquitination.

      - The data in Table 2 do not clearly show that the identified K48 ubiquitination is associated with TEAD4, especially since the experiments were presumably conducted on whole cell lysates with KGG enrichment. This raises the possibility that the K48 ubiquitination could originate from other proteins. Alternatively, if the authors performed immunoprecipitation on TEAD followed by mass spectrometry, this should be explicitly described in the text and materials and methods section.

      - Figure 2D: The methodology for measuring the Hippo signature is unclear, as is the case for Figures 7E and F regarding the analysis of Hippo target genes.

      - Figure S3F requires quantification with additional replicates for validation.

      - There is a misleading claim in the discussion stating "TEAD PARylation by PAR-family members (Figure 3)"; however, the demonstration is only for PARP1, which should be corrected.

    2. eLife assessment

      This important and comprehensive study describes the development of a heterobifunctional degrader, which is used to provide insights into the mechanism of TEAD proteolysis, with potential implications for signaling pathways in cancer. While the methods are solid, the analyses and descriptions are still incomplete. With further molecular refinements, experimental controls, and a more cohesive and unified story, this article will be of interest to cancer biologists and scientists interested in proteostasis, cellular signaling, and post-translation modification of proteins.

    3. Reviewer #1 (Public Review):

      Summary:<br /> In the first half of this study, Pham et al. investigate the regulation of TEAD via ubiquitination and PARylation, identifying an E3 ubiquitin ligase, RNF146, as a negative regulator of TEAD activity through an siRNA screen of ubiquitin-related genes in MCF7 cells. The study also finds that depletion of PARP1 reduced TEAD4 ubiquitination levels, suggesting a certain relationship between TEAD4 PARylation and ubiquitination which was also explored through an interesting D70A mutation. Pham et al. subsequently tested this regulation in D. melanogaster by introducing Hpo loss-of-function mutations and rescuing the overgrowth phenotype through RNF146 overexpression.

      In the second half of this study, Pham et al. designed and assayed several potential TEAD degraders with a heterobifunctional design, which they term TEAD-CIDE. Compounds D and E were found to effectively degrade pan-TEAD, an effect which could be disrupted by treatment with TEAD lipid pocket binders, proteasome inhibitors, or E1 inhibitors, demonstrating that the TEAD-CIDEs operate in a proteasome-dependent manner. These TEAD-CIDEs could reduce cell proliferation in OVCAR-8, a YAP-deficient cell line, but not SK-N-FI, a Hippo pathway-independent cell line. Finally, this study also utilizes ATAC-seq on Compound D to identify reductions in chromatin accessibility at the regions enriched for TEAD DNA binding motifs.

      Strengths:<br /> The study provides compelling evidence that the E3 ubiquitin ligase RNF146 is a novel negative regulator of TEAD activity. The authors convincingly delineate the mechanism through multiple techniques and approaches. The authors also describe the development of heterobifunctional pan-degraders of TEAD, which could serve as valuable reagents to more deeply study TEAD biology.

      Weaknesses:<br /> The scope of this study is extremely broad. The first half of the paper highlights the mechanisms underlying TEAD degradation; however, the connection to the second half of the paper on small molecule degraders of TEAD is jarring, and it seems as though two separate stories were combined into this single massive study. In my opinion, the study would be stronger if it chose to focus on only one of these topics and instead went deeper.

      Additionally, the figure clarity needs to be substantially improved, as readability and interpretation were difficult in many panels. Lastly, there are numerous typos and poor grammar throughout the text that need to be addressed.

    4. Reviewer #2 (Public Review):

      The paper is made of two parts. One deals with RNF146, the other with the development of compounds that may cause TEAD degradation. The two parts are rather unrelated to each other.

      The main limitation of this work is the lack of evidence that TEAD factors are in fact regulated by the proteasome and ubiquitylation under endogenous conditions. Also lacking is the demonstration that TEADs are labile proteins to the extent that such quantitative regulation at the level of stability can impact on YAP-TAZ biology. Without these two elements, the relevance and physiological significance of all these data are unclear.

      As for the development of new inhibitors of TEAD, this is potentially very interesting but underdeveloped in this manuscript. Irrespectively, if TEAD is stable, these molecules are likely lead compounds of interest. If TEAD is unstable, as entertained in the first part of the paper, then these molecules are likely marginal.

      Here are a few other specific observations:

      1. The effect of MG is shown in a convoluted way, by MS. What about endogenous TEAD protein stability?

      2. The effect of siRNF on YAP target genes in Fig. 2D is not statistically significant.

      3. All assays rely on protein overexpression and Ub-laddering.

      4. An inconsistency exists in the only biological validation (performed only by overexpression), on fly eye size. RNF gain in Fig. 4C does the opposite of what is expected from what is portrayed here as a YAP/TEAD inhibitor: RNF gain is shown to INCREASE eye size, phenocopying a Hippo loss-of-function phenotype. According to the model proposed, RNF addition should reduce eye size. The authors stated that "This is in contrast to the anti-growth effect of RNF-146 in the Hpo loss-of-function background and indicates RNF146 may regulate other genes/pathways controlling eye sizes besides its role as a negative regulator of Sd/yki activity". This raises questions about what the authors are really studying: why, according to the authors, should these caveats apply to the controls but not to the Hpo mutants?

      5. The role of TEAD inactivation on YAP function is already well known. Disappointingly, no prior literature is cited. In any case, this is a mere control.

      6. The second part of the paper on the Development and Screening of pan-TEAD lipid pocket degraders is interesting but unconnected to the above. The degradation pathway it involves has nothing to do with the enzyme described in the first figures.

      7. The role of CIDE on YAP accessibility to chromatin is superficially executed. Key controls are missing, along with the connection to mechanisms and prior knowledge of TEAD, YAP, chromatin, and other TEAD inhibitors, just to mention a few.

      8. The physiological relevance and the mechanistic interpretation of the ATAC-seq results in OVCAR cells are missing.

    1. Author Response

      The following is the authors’ response to the current reviews.

      We would firstly like to thank all reviewers for their comments and support of this manuscript.

      Reviewer #1 (Recommendations For The Authors):

      No further recommendations.

      Reviewer #2 (Recommendations For The Authors):

      All of my comments have been sufficiently addressed.

      Reviewer #3 (Recommendations For The Authors):

      Thanks for responding to my former recommendations constructively. I believe these points have been fully addressed in this new version.

      However, I have not seen any comments on the points I raised in my former public review concerning the I-2 dependence of the FonSIX4 cell death. Do you know whether FonSIX4 would trigger cell death in tissues not expressing any I-2?

      We are a little confused concerning this comment. I-2 is a different class of resistance protein (NLR) that recognises Avr2 and this is likely to be intracellular. From the previous public review, we believe reviewer 3 may have been asking us to clarify the dependence of I (MM or M82) on FonSIX4 cell death. We have performed these controls by expressing FonSIX4 and associated FonSIX4/Avr1 chimeras in N. benthamiana (with the PR-1 signal peptide for efficient secretion of effectors) and it does not cause cell death in the absence of the I receptor – see S11F Fig. This was not explicitly conveyed in the text, so we have included the following: “Using the N. benthamiana assay we show FonSIX4 is recognised by I receptors from both cultivars (IM82 and iMoneymaker) and cell death is dependent on the presence of IM82 or iMoneymaker (Fig 5B, S11 Fig).”

      I still recommend discussing whether the Avr1 residues crucial for Avr activity are in the same structural regions of the C-terminal domain where previous work has identified residues under diversifying selection in symbiotic fungal FOLD proteins.

      The region important for recognition does encompass some residues within the structural region previously reported to be under diversifying selection in FOLD effectors from Rhizophagus irregularis (two residues within one beta-strand). However, we also see residues that do not overlap with this area. We also note that the mycFOLD proteins analysed in symbiotic fungi are heavily skewed towards strong structural similarity with FolSIX6 (similar cysteine spacing within both N and C-domains and structural orientation of the N and C-domains) rather than Avr1. We are under the impression that Avr1 was not included in the analysis of diversifying selection in symbiotic fungal FOLD proteins, and it is unclear to us whether close Avr1 homologues are present. With this in mind, and considering our already lengthy discussion (as previously highlighted during review), we have decided not to include further discussion concerning this point.


      The following is the authors’ response to the original reviews.

      We would like to thank the editor(s) and reviewers for their work concerning our manuscript. Most of the suggested changes were related to text changes which we have incorporated into the revised version. Please find our response to reviewers below.

      Reviewer #1 (Recommendations For The Authors):

      I only have very minor suggestions for the authors. The first one comes from reading the manuscript and finding it very dense with so many acronyms. This will limit the audience that will read the study and appreciate its impact. This is more noticeable in the Results, with many passages that I would suggest moving to Methodology.

      We thank reviewer 1 for their very positive review. We understand that due to the nature of this study, which includes many protein alleles/mutations that were expressed with different boundaries etc., it is difficult to achieve this. Reviewer 2 asked for more details to be provided. We hope we have achieved a nice balance in the revised manuscript.

      Something else that would facilitate the reading of the manuscript is consistent effector naming. The authors use the SIX name for some effectors and the Avr name for others, which makes the text difficult to follow.

      We have tried to make this consistent for Avr1 (SIX4), Avr2 (SIX3) and Avr3 (SIX1). Other SIX effectors are not known Avrs so the SIX names were used.

      Reading the manuscript and seeing how in most of the sections the authors used a computational approach followed by an experimental approach, I wonder why Alphafold2-multimer was not used to investigate the interaction between the effector and the receptor?

      This is a great suggestion, and we have certainly investigated this. However, to date there is no experimental evidence to directly support a direct interaction between I and Avr1. Post review, we spent some time trying to capture an interaction using a co-immunoprecipitation approach, but we have not yet been able to obtain robust data that support this. We are currently looking to study this using protein biophysics/biochemistry, but this work will take some time.

      Reviewer #2 (Recommendations For The Authors):

      We thank reviewer 2 for the very thorough editing and recommendations. We have incorporated all minor text edits below into the manuscript.

      Line 43: perhaps "Effector recognition" instead of "Effector detection", to be consistent with line 51?

      Line 60: Change to "leads".

      Line 79: Italicise Avr2.

      Line 94: Add the acronym ETI in parentheses after "effector-triggered immunity".

      Line 106: "(Leptosphaeria Avirulence-Supressing)" should be "(Leptosphaeria Avirulence and Suppressing)".

      Line 112: Change "defined" to "define".

      Line 119: Spell out the species name on first use.

      Line 205: Glomeromycota is a division rather than a genus. Consistent with Fig 2, it also does not need to be italicized.

      Line 207: Change "basidiomycete" to "Division Basidiomycota", consistent with Fig 2.

      Line 214: Change "alignment of Avr1, Avr3, SIX6 and SIX13" to "alignment of the mature Avr1, Avr3, SIX6 and SIX13 sequences".

      Line 324: Change "solved structures" to "solved protein structures".

      Line 335: Spell out acronyms like "MS" on first use in figure legends. Also dpi in other figure legends.

      Line 341: replace "effector-triggered immunity (ETI)" with "(ETI)" - see comment on Line 94.

      Line 370: Change "domains" to "domain".

      Line 374: In the title, change "C-terminus" to "C-domain", consistent with the rest of the figure legend.

      Line 404: Change "(basidiomycetes and ascomycetes)" to "(Basidiomycota and Ascomycota fungi)", consistent with Fig 2C.

      Line 416: Change "in" to "by".

      Line 427: un-italicize the parentheses.

      Line 519: First mention of NLR. Spell out the acronym on first use in main text. S5 and S11 figure titles should be bolded.

      Line 852: Replace "@" with "at".

      S4 Table: Gene names should be italicised.

      S5 Table: Needs to be indicated that the primer sequences are in the 5´-3´ orientation.

      With regards to the Agrobacterium tumefaciens-mediated transient expression assays involving co-expression of the Avr1 effector and I immune receptor, the authors need to make clear how many biological replicates were performed as this information is only provided for the ion leakage assay.

      We have added these data to the figure legend.

      Line 57: For me, the text "Fol secretes a limited number of structurally related effectors" reads as Fol secretes structurally related effectors, but very few of them are structurally related. Perhaps it would be better to say that the effector repertoire of Fol is made up of proteins that adopt a limited number of structural folds, or that the effector repertoire can be classified into a reduced set of structural families?

      This edit has been incorporated.

      Lines 66-67: Subtle re-wording required for "The best-characterized pathosystem is F. oxysporum f. sp. lycopersici (Fol)", as a pathosystem is made up of a pathogen and its host. Perhaps "The best-characterized pathosystem involves F. oxysporum f. sp. lycopersici (Fol) and tomato".

      Sentence has been reworded.

      Line 113 and throughout: Stick with one of "resistance protein", "receptor", "immune receptor" and "immunity receptor" throughout the manuscript.

      We have decided to use both receptor and immunity receptor as not all receptors investigated in the manuscript provide immunity.

      Lines 149-150: The title does not fully represent what is shown in the figure. The text "that is unique among fungal effectors" can be deleted as there is nothing in Fig 1 that shows that the fold is unique to fungal effectors.

      Figure title has been changed.

      Line 173: The RMSD of Avr3 is stated as being 3.7 Å, but in S3 Fig it is stated as being 3.6 Å.

      This was a mistake in the main text and has been corrected.

      Lines 202-204: This sentence needs to be reworded, as the way that it is written implies that the Diversispora and Rhizophagus genera are in the Ascomycota division. Also, "Ascomycetes" should be changed to "Ascomycota fungi", consistent with Fig 2.

      Sentence has been reworded.

      Line 233: "Scores above 8". What type of scores? Z-scores?

      These are Z-scores. This has been added in text.

      Lines 242-246: It is stated that SIX9 and SIX11 share structural similarity to various RNA-binding proteins, but the scores used to make these assessments are not given. The scores should be provided in the text.

      Z-scores have been added.

      Fig 4A: SIX3 should be Avr2, consistent with line 292. The gene names should be italicised in Fig 4A.

      SIX3 was changed to Avr2. Gene names have been italicised.

      Line 356: Subtle rewording required, as "co-infiltrated with both IM82 and iMoneymaker" implies that you infiltrated with protein rather than Agrobacterium strains.

      Sentence has been reworded.

      Fig 5A, Fig 5C and Line 380: Light blue is used, but this looks grey. Perhaps change colour, as grey is already used to show the pro-domain in Fig 5A (or simply change the colour used to highlight the pro-domain)?

      Colour depicting the C-domain was changed.

      Lines 530-531: This text is no longer correct. Rlm4 and Rlm3 are now known to be alleles of Rlm9. See: Haddadi, P., Larkan, N. J., Van deWouw, A., Zhang, Y., Neik, T. X., Beynon, E., ... & Borhan, M. H. (2022). Brassica napus genes Rlm4 and Rlm7, conferring resistance to Leptosphaeria maculans, are alleles of the Rlm9 wall‐associated kinase‐like resistance locus. Plant Biotechnology Journal, 20(7), 1229.

      We thank the reviewer for picking this up. This text has been updated.

      Line 553: Provide more information on what the PR1 signal peptide is.

      More information about the PR1 signal peptide has been added.

      Lines 767-781: Descriptions and naming conventions of proteins throughout the figure legend need to be consistent and better reflect their makeup. For example, I think it would be best to put the sequence range after each protein mentioned - e.g. Avr1(18-242) or Avr1(59-242) instead of Avr1, PSL1_C37S(18-111) instead of PSL1_C37S, etc. Furthermore, it is often stated that a protein is full-length when it lacks a signal peptide - my thought is that if a protein lacks its signal peptide, it is not full-length. The acronym "PD" also needs to be spelled out as "pro-domain (PD)" in the figure legend.

      We have incorporated sequence ranges, upon first use, for proteins that were produced. Sequence ranges that were modelled in AlphaFold2 were not added in text because they can be found in Supplementary Table 3.

      Lines 853-845: It is stated the sizes of proteins are indicated above the chromatogram in S10 Fig, but this is not the case. It is also not clear from S10B Fig that the faint peaks correspond to the peaks in the Fig 4B chromatogram. In S10D Fig, the stick of C58S is difficult to see. Perhaps change the colour or use an arrow/asterisk?

      Protein size estimates have been added above the chromatogram. Added text to indicate that the faint peaks correspond to peaks in Fig 4B. Added an asterisk in S10D Fig to identify the location of C58.

      S14 Fig is not mentioned/referenced in the main text of the manuscript.

      This was an oversight; a reference to S14 Fig has been added.

      The reference list needs to be updated to accommodate those referenced bioRxiv preprints that have now been published in peer-reviewed journals.

      The reference list has been updated.

      Reviewer #3 (Recommendations For The Authors):

      It would be good to discuss whether the pro-domains affect virulence or avirulence activity.

      Kex2, the protease that cleaves the pro-domain, functions in the Golgi. We therefore suspect that the pro-domain is removed prior to secretion. For recombinant protein production in E. coli, we find that these pro-domains are necessary to obtain soluble protein (doi: 10.1111/nph.17516). As we require the pro-domain for protein production and cannot completely remove it from our preps, we cannot perform experiments to test this or comment further. In a paper that identified SIX effectors in tomato utilising a proteomics approach (https://bsppjournals.onlinelibrary.wiley.com/doi/10.1111/j.1364-3703.2007.00384.x), it appears that the pro-domains were not captured in the analysis. This supports the conclusion that they are not associated with the mature/secreted protein.

      The authors stated that the C-terminal domain of SIX6 has a single disulfide bond unique to SIX6. Please clarify in which context is it unique: in Fusarium or across all FOLD proteins?

      This is in direct comparison to Avr1 and Avr3. The disulfide in the C-domain of SIX6 is unique compared to Avr1 and Avr3. This has been made clear in text.

      The structural similarity of FOLD proteins to other known structures has been discussed (lines 460ff), but it is not clear whether all structures and models identified in this work would yield cysteine inhibitor and tumor necrosis factors as best structural matches in the database or whether this is specific to a single FOLD protein. Please consider discussing recently published findings by others (Teulet et al. 2023, New Phytologist) on this aspect.

      This analysis was performed for Avr1; we obtained relatively low similarity hits for Avr3/Six6. We have updated this text accordingly: “Unfortunately, the FOLD effectors share little overall structural similarity with known structures in the PDB outside of the similarity with each other. At a domain level, the N-domain of the FOLD effector Avr1 has some structural similarities with cystatin cysteine protease inhibitors (PDB code: 4N6V, PDB code: 5ZC1) [60, 61], and the C-domain with tumour necrosis factors (PDB code: 6X83) [62] and carbohydrate-binding lectins (PDB code: 2WQ4) [63]. Relatively weak hits were observed for Avr3/Six6.”

      It might be useful to clearly point out that the ToxA fold and the C-terminus of the FOLD fold are different.

      We provide secondary structure topology maps of the FOLD and ToxA-like families in S8 Fig, which highlight the differences in topology between these two families.

      Please add information to Fig.S8 listing the approach to generate the secondary structure topology maps.

      We have added this information in the figure caption.

    2. Reviewer #1 (Public Review):

      Yu et al. investigated the structures of Fusarium oxysporum f. sp. lycopersici SIX effectors using experimental and computational approaches. In doing so, the authors identified several SIX effectors as members of the FOLD family and expanded the known diversity of the SIX effectors. A very interesting and novel finding is the presence of putative FOLD effectors in the secretomes of other Ascomycetes, sharing structural similarities with the SIX effectors Avr1, Avr3 and SIX6.

      By performing technically sound predictions and experimental confirmation, the authors also confirmed co-operative interactions between Fol effectors, extending an observation previously made for other protein pairs to new SIX effectors. In addition, the authors contributed to the understanding of the interaction between Fol effectors, specifically FOLD and LARS effectors, and I receptors, including the suppression of immunity by structurally similar effectors.

      The conclusions of this paper are supported by the data, and I think it is a pioneering study analyzing the correspondence between AlphaFold predictions and experimentally derived structures, highlighting that models can answer the scientific questions in some cases but may not be sufficient in others.

    3. Reviewer #2 (Public Review):

      Yu et al. investigated the structural landscape of 'secreted in xylem' (SIX) effector (virulence and avirulence) proteins from the plant-pathogenic fungus, Fusarium oxysporum f. sp. lycopersici (Fol), with the goal of better understanding effector function and recognition by host (tomato) immune receptors. In recent years, several experimental and computational studies have shown that many effector proteins of plant-associated fungi can be assigned to one of a few major structural families. In the study by Yu et al., X-ray crystallography was used to show that two avirulence effectors of Fol, Avr1 (SIX4) and Avr3 (SIX1), which are recognized by the tomato immune receptors I and I-3, respectively, form part of a new structural family, the Fol dual-domain (FOLD) family, found across three fungal divisions. Using AlphaFold2, an ab initio structural prediction tool, the authors then predicted the structures of all proteins within the Fol SIX effector repertoire (and other effector candidates) and provided evidence that two other effectors, SIX6 and SIX13, also belong to this family.

      In addition to identifying members of the FOLD family, structural prediction revealed that proteins of the Fol effector repertoire can largely be classified into a reduced set of structural families. Examples included four members of the ToxA-like family (including Avr2 (SIX3) and SIX8), as well as four members of a new family, Family 4 (including SIX5 and PSL1). Given previous studies had demonstrated that Avr2 (ToxA-like) and SIX5 (Family 4) interact and function together, and that the genes encoding these proteins are divergently transcribed, and because homologues of SIX8 (ToxA-like) and PSL1 (Family 4) from another Fusarium pathogen are functionally dependent on each other and, in the case of Fol, are encoded by genes that are next to each other in the genome, the authors hypothesized that SIX8 and PSL1 may also physically interact. In line with this, co-incubation of the SIX8 and PSL1 proteins, followed by size exclusion chromatography (SEC), gave elution and gel migration profiles consistent with interaction in the form of a heterodimer. AlphaFold2-Multimer modelling then suggested that this interaction was mediated through an intermolecular disulfide bond. Such a prediction was subsequently confirmed through mutational analysis of the relevant cysteine residue in each protein in conjunction with SEC.

      Finally, using a variant (homologue) of Avr1 from another Fusarium pathogen, as well as chimeric forms of this protein that integrated regions of Avr1 from Fol, Yu et al. determined through co-expression assays in Nicotiana benthamiana with the I immune receptor, as well as subsequent ion leakage assays, that the C-domain of Avr1 is recognized by the I immune receptor. Furthermore, through these assays, the authors were also able to show that surface-exposed residues in the C-domain enable Avr1 to evade recognition by a variant of the I receptor in Moneymaker tomato that does not provide resistance to Fol.

      Overall, the manuscript presents a large body of work that is well supported by the data. A key strength of the manuscript is the validation (benchmarking) of protein structures predicted using AlphaFold2, which is a first for large-scale effector structure prediction papers published to date. Another key strength is the use of large-scale effector structure predictions to make hypotheses about functional relationships or interactions that are then tested (i.e. the SIX8-PSL1 protein interaction and recognition of Avr1 by the I immune receptor). This testing again goes above and beyond the large-scale effector structure prediction papers published to date. Taken together, this showcases how experimental and computational experiments can be effectively combined to provide biologically relevant data for the plant protection and molecular plant-microbe interactions fields.

      In terms of weaknesses, the manuscript could have validated the SIX8-PSL1 protein interaction with in planta experiments, such as co-immunoprecipitation assays or co-localization experiments in conjunction with confocal microscopy, to provide support for the interaction in a plant setting. However, given what is already known about the Avr2-SIX5 interaction, these additional experiments are not crucial and could instead form part of a follow-up study.

    4. Reviewer #3 (Public Review):

      In this work, the authors shed light onto the structures of Fusarium oxysporum f.sp. lycopersici proteins involved in the infection of tomato. They unravelled several new secreted effector protein structures and additionally used computational approaches to structurally classify the remaining effectors known from this pathogen. Through this they uncovered a new and unique structural class of proteins which they found to be present and widely distributed in fungal plant pathogens and plant symbiotic fungi. The authors further predicted structural models for the full SIX effector set revealing that genome-proximal effector pairs share similar structural classes. Building on their Avr1 structure, the authors also define the C-terminal domain and specific amino acid residues that are essential to Avr1 detection by its cognate immune receptor.

      A major strength of this work is a portfolio of several (Avr1, Avr3, SIX6, SIX8) new structurally resolved proteins which led to the discovery that several of them fall into the same structural class. These findings are supported by strong results.

      The experiments addressing the structure-function relationship of Avr1's avirulence activity are highly relevant to our understanding of disease resistance mechanisms against Fusarium. Additional controls would allow for better support of the conclusions to be drawn. An example is FonSIX4's cell death activity in N. benthamiana leaves and whether FonSIX4 cell death is indeed dependent on the tomato I receptor. Complementary work in Fusarium mutants lacking Avr1 and expressing chimeric versions would document that the observations from transient expression in Nicotiana benthamiana are relevant in the biological context of a Fusarium/tomato interaction.

      The discovered solvent-exposed residues conditioning Avr1 recognition by the I receptor seem to be positioned in an area of the protein which had previously been highlighted as being highly variable in FOLD proteins of symbiotic fungi but it is not clear from the work whether this is indeed the case or whether Avr1 differs significantly in its structure from FOLD proteins found in other fungi. It remains to be tested whether the residues conditioning avirulence activity are also crucial for virulence activity in Fusarium.

      This work uncovered a new structural class of proteins with critical roles in plant-pathogen interactions. Structure-based predictions and genome-wide comparisons have emerged as a new approach enabling the identification of similar proteins with divergent sequences. The work undertaken by the authors adds to a growing body of work in plant-microbe research, predominantly from pathogenic organisms, and more recently in symbiotic fungi.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The authors found that nifuroxazide has the potential to augment the efficacy of radiotherapy in HCC by reducing PD-L1 expression. This effect may be attributed to increased degradation of PD-L1 through the ubiquitination-proteasome pathway. The paper provides new ideas and insights to improve treatment effectiveness, however, there are additional points that could be addressed.

      • The paper highlights that the combination of nifuroxazide increases tumor cell apoptosis. A discussion regarding the potential crosstalk or regulatory mechanisms between apoptotic pathways and PD-L1 expression would be valuable.

      Response: Thank you very much for your suggestion. Research has shown that regulating the STAT3/PD-L1 pathway can effectively increase apoptosis in lung cancer cells (1). Our study confirmed that nifuroxazide can effectively inhibit the expression of p-STAT3 and PD-L1 in liver cancer cells, which may be the reason for the increased apoptosis of these cells. We have added relevant descriptions in the discussion.

      • The benefits and advantages of nifuroxazide combination could be compared to the current clinical treatment options.

      Response: Thank you greatly for your insightful feedback. The primary objective of this study is to explore whether nifuroxazide can effectively enhance the degradation of PD-L1, thereby increasing the radiosensitivity of HCC. Our research reveals that compared to radiation therapy alone, combination therapy involving nifuroxazide and radiation significantly inhibits tumor growth in mice and boosts the anti-tumor immune response. This finding could potentially provide a valuable strategy for patients who exhibit resistance to radiation therapy in clinical practice. Moreover, clinical trial investigations have demonstrated that nivolumab, a PD-1 monoclonal antibody, when combined with radiation therapy for HCC, exhibits promising safety and efficacy (2). This evidence supports the future application of nifuroxazide in the treatment of HCC. However, to reach this objective, we must continue to conduct extensive research, including comparing nifuroxazide with existing therapies in clinical practice. We believe that nifuroxazide not only significantly inhibits the expression of PD-L1 protein in HCC cells but also functions as a PD-L1 inhibitor. Furthermore, it effectively curbs the proliferation and migration of HCC cells, induces tumor cell apoptosis, and may exhibit enhanced anti-tumor effects, making it a promising candidate for clinical use. We have incorporated relevant discussion content in the article to address these points.

      Reviewer #2 (Public Review):

      Summary:

      Zhao et al. aimed to explore an important question - how to overcome the resistance of hepatocellular carcinoma cells to radiotherapy? Given that the immune-suppressive microenvironment is a major mechanism underlying resistance to radiotherapy, they reasoned that a drug that blocks the PD-1/PD-L1 pathway could improve the efficacy of radiation therapy and chose to investigate the effect of Nifuroxazide, an inhibitor of stat3 activation, on radiotherapy efficacy in treating hepatocellular carcinoma cells. From in vitro experiments, they find combination treatment (Nifuroxazide+ radiotherapy) increases apoptosis and reduces proliferation and migration, in comparison to radiotherapy alone. From in vivo experiments, they demonstrate that combined treatment reduces the size and weight of tumors in vivo and enhances mice survival. These data indicate a better efficacy of combination therapy compared to radiotherapy alone. Moreover, they also determined the effect of combination therapy on tumor microenvironment as well as peripheral immune response. They find that combination therapy increases infiltration of CD4+ and CD8+ cells as well as M1 macrophages in the tumor microenvironment. Interestingly, they find that the ratio of Treg cells in spleen is increased by radiotherapy but decreased by Nifuroxazide. Considering the immune-suppressive role of Treg cells, this finding is consistent with reduced tumor growth by combination therapy. However, it is unclear whether the combined therapy affects the ratio of Treg cells in the tumors or not. The most intriguing part of the study is the determination of the effect of Nifuroxazide on PD-L1 expression in the context of radiotherapy. Considering Nifuroxazide is a stat3 activation inhibitor and stat3 inhibition leads to reduced expression of PD-L1, one would expect Nifuroxazide decreases PD-L1 expression through stat3. 
      However, they found that the effect of Nifuroxazide on PD-L1 is dependent on the GSK3-mediated proteasome pathway and independent of stat3 in the given experimental context. To determine the relevance to human hepatocellular carcinoma, they also measured the PD-L1 expression in human tumor tissues of HCC patients pre- and post-radiotherapy. The increased PD-L1 expression level in HCC after radiotherapy is impressive. However, it is unclear whether the patients selected in the study had disease resistant to radiotherapy or not.

      Overall, the data are convincing and support the conclusions.

      Strengths:

      1) Novel finding: Identified novel mechanism underlying the effect of Nifuroxazide on PD-L1 expression in hepatocellular carcinoma cells in the context of radiotherapy.

      2) Comprehensive experimental approaches: using different approaches to prove the same finding. For example, in Fig 4, both IHC and WB were used. In Fig 5, both IF and WB were used.

      3) Human disease relevance: Compared observations in mice with human tumor samples.

      The question in the summary, “However, it is unclear whether the combined therapy affects the ratio of Treg cells in the tumors or not”.

      Response: Thank you very much for your valuable feedback. We have included additional flow cytometry results regarding the expression of relevant Treg cells (CD4+CD25+Foxp3+ T lymphocytes) in tumor tissues (Supplementary Fig 2). Our findings indicate that the number of Treg cells in tumor tissues significantly decreased following combination therapy with nifuroxazide and radiotherapy.

      The question in the summary, “However, it is unclear whether the patients being selected in the study had resistant disease to radiotherapy or not”.

      Response: Thank you very much for your valuable feedback. All the HCC patients selected in this study experienced recurrence after radiation treatment.

      Weaknesses:

      1) It is hard to tell whether the observed phenotype and mechanism are generic or specific to the limited cell lines used in the study. The in vitro experiments were performed in one human cell line and the in vivo experiments were performed in one mouse cell line.

      Response: Thank you very much for your feedback. We have included additional experimental data from another human cell line Huh7 (Supplementary Fig 3).

      2) The study did not distinguish the effect of increased radiosensitivity by nifuroxazide from combined anti-tumor effects by two different treatments.

      Response: Thank you greatly for your insightful feedback. In this study, we primarily compared the antitumor effects of nifuroxazide combined with radiotherapy versus either nifuroxazide or radiotherapy alone, and confirmed that the combined treatment demonstrated a more potent anti-hepatocellular carcinoma effect compared to single therapy. Furthermore, to achieve the goal of utilizing nifuroxazide for the treatment of clinical hepatocellular carcinoma, additional research is necessary, including comparisons with other clinically established therapies. We have also incorporated relevant discussions in our analysis.

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors embarked on an exploration of how nifuroxazide could enhance the responsiveness to radiotherapy by employing both an in vitro cell culture system and an in vivo mouse tumor model.

      Strengths:

      The researchers conducted an array of experiments aimed at revealing the function of nifuroxazide in aiding the radiotherapy-induced reduction of proliferation, migration, and invasion of HepG2 cells.

      Weaknesses:

      The authors did not provide the molecular mechanism through which nifuroxazide collaborates with radiotherapy to effectively curtail the proliferation, migration, and invasion of HCC cells. Moreover, the evidence supporting the assertion that nifuroxazide contributes to the degradation of radiotherapy-induced upregulation of PD-L1 via the ubiquitin-proteasome pathway appears to be insufficient. Importantly, further validation of this discovery should involve the utilization of an additional syngeneic mouse HCC tumor model or an orthotopic HCC tumor model.

      Response: Thank you very much for your insightful comments. Nifuroxazide has been demonstrated to inhibit the expression of p-STAT3, thereby suppressing tumor cell proliferation and migration (3, 4). In our study, we observed that after 48 hours of treatment with Nifuroxazide, the expression of p-STAT3 in irradiated cells was significantly inhibited. Furthermore, compared to radiation alone, combined Nifuroxazide and radiotherapy resulted in a more pronounced decrease in PCNA expression. Simultaneously, we performed additional detection of migration-related protein MMP2 expression (revised Fig 2B), confirming that combined Nifuroxazide and radiotherapy led to a more significant inhibition of MMP2 expression. These findings suggest that the combined treatment may be responsible for the synergistic suppression of HCC cell proliferation and migration. We have included relevant discussions in our manuscript.

      Our initial results indicate that Nifuroxazide inhibits the expression of PD-L1 at the protein level, but does not affect its mRNA level. Interestingly, upon treatment with a proteasome inhibitor MG132, the inhibitory effect of Nifuroxazide on PD-L1 was eliminated, suggesting that Nifuroxazide may enhance the degradation of PD-L1 protein. Our experiments have demonstrated the inhibitory effect of Nifuroxazide on PD-L1 in both human and mouse cell lines. However, to translate these findings into clinical application for the treatment of hepatocellular carcinoma, additional research is necessary, including validation in genetically engineered mouse models of HCC. We have addressed these points in the discussion section of our manuscript.

      Reviewer #1 (Recommendations For The Authors):

      1) Please improve the quality of Figure 3E. It is hard to figure out the bar and details.

      Response: Thank you for your valuable feedback. We have meticulously revised the figures to enhance their clarity and presentation (revised Fig 3E).

      2) In Figure 7E, please elucidate the methods used for calculating the PD-L1 mRNA level. Please also adjust the picture angle and label the marker size on the left.

      Response: Thank you for your feedback. We have incorporated a method for calculating PD-L1 mRNA levels and revised the corresponding figures accordingly (revised Fig 7E).

      Reviewer #2 (Recommendations For The Authors):

      Questions:

      1) What is the advantage of using a combination of nifuroxazide and radiotherapy in comparison to using a combination of anti-PD1/PDL1 and radiotherapy?

      Response: Thank you very much for your insightful comments. We believe that the advantage of nifuroxazide over PD-1 or PD-L1 antibodies lies in its ability not only to effectively inhibit PD-L1 expression but also to suppress tumor cell proliferation, migration, and promote cell apoptosis (Supplementary Fig 1). We have also expanded on these aspects in the discussion section of the manuscript.

      2) For the characterization of tumor microenvironment and immune cells in the spleen, were the same cell populations being investigated? What about NK and Treg cells in tumors? What about M1 macrophages in spleen?

      Response: Thank you very much for your insightful suggestion. We have measured the infiltration of NK and Treg cells in tumor tissues (Supplementary Fig 2), as well as the abundance of M1 macrophages (revised Fig 6) in the spleen, and provided additional relevant data to strengthen our study.

      Other comments:

      1) The data in Fig 1 is solid. However, it is hard to distinguish the effect of increased radiosensitivity by nifuroxazide from combined anti-tumor effects by two different treatments. The anti-tumor role of Nifuroxazide has been reported in melanoma, colorectal carcinoma, and hepatocellular carcinoma previously (PMID: 26830149; 28055016, 26154152). Therefore, the increased apoptosis and decreased proliferation and migration could be caused by nifuroxazide and not related to the sensitivity of cells to radiation therapy.

      Response: Thank you very much for your constructive feedback. As you suggested, the anti-tumor role of nifuroxazide has been reported. However, the innovation of our study does not lie in confirming its antitumor effects but rather in demonstrating how nifuroxazide can enhance radiotherapy's efficacy in treating hepatocellular carcinoma by inhibiting PD-L1 levels.

      We compared the efficacy of combined therapy versus radiotherapy and found that compared to radiation alone, combined therapy more significantly inhibited hepatocellular carcinoma cell proliferation and migration. In our animal model, we compared the therapeutic effects of combined therapy, nifuroxazide, and radiotherapy on hepatocellular carcinoma-bearing mice. We observed that compared to individual treatment groups, combined therapy more profoundly suppressed tumor growth and enhanced the antitumor effects in the mice.

      In response to your feedback, we have expanded the discussion on the impact of combined therapy versus nifuroxazide or radiotherapy on hepatocellular carcinoma cell proliferation, migration, and apoptosis (Supplementary Fig 1). The data show that compared to either individual therapy, combined therapy further inhibited cell proliferation and migration while promoting apoptosis.

      2) There is no direct evidence to show the improved efficacy of radiation therapy by nifuroxazide through the degradation of PD-L1.

      Response: Thank you very much for your valuable suggestions. In our cell experiments, we found that nifuroxazide inhibits the increased expression of PD-L1 in cells induced by radiation therapy, and this inhibitory effect is counteracted when using the proteasome inhibitor MG132. Therefore, we speculate that nifuroxazide may inhibit PD-L1 expression through a proteasome-dependent mechanism. To better reflect this, we have revised the title of our manuscript to "Nifuroxazide Suppresses PD-L1 Expression and Enhances the Efficacy of Radiotherapy in Hepatocellular Carcinoma."

      3) "The oncogene Stat3.....was effectively inhibited by radiotherapy in cells" - this sentence may be rephrased to make the point clear. The authors might mean to say "activation of the oncogene stat3...."

      "The results demonstrated that the combination therapy increased the expression of PARP," the authors might mean to say "expression of c-PARP"

      Response: Thank you very much for your feedback. We have revised the relevant sentence descriptions to improve clarity and accuracy.

      4) "histomorphology significantly improved after the treatment with nifuroxazide and radiation therapy (Fig 3E)." How to define "improved histomorphology"? The authors may want to provide more details to clarify "improved".

      Response: Thank you very much for your feedback. We have revised the relevant sentence descriptions to improve clarity and accuracy.

      5) In addition to normalizing protein expression by tubulin, the authors may consider normalizing p-stat3 expression level by stat3.

      Response: Thank you very much for your feedback. We have conducted a quantitative analysis of the expression levels of p-STAT3 and STAT3 (revised Fig 2A).

      6) Figure 3C and D, using a different color to represent each group might help the readers to better differentiate each group.

      Response: Thank you very much for your feedback. Following your suggestion, we have revised the figures accordingly (revised Fig 3C and 3D).

      Reviewer #3 (Recommendations For The Authors):

      In this study, the authors revealed the pivotal role of nifuroxazide in augmenting the efficacy of radiotherapy. This was evidenced by its synergistic effect in suppressing the proliferation and migratory capabilities of HCC cells, alongside its capacity to induce apoptosis in these cells. Furthermore, their findings underscored the substantial synergy between nifuroxazide and radiotherapy in retarding tumor growth, thereby extending survival rates in a tumor-bearing murine model. Moreover, the authors observed that nifuroxazide combined with radiotherapy significantly increases the tumor-infiltrating CD4+ T cells, CD8+ T cells, and M1 macrophages. Finally, the authors found that nifuroxazide countered the radiotherapy-induced upregulation of PD-L1 through the ubiquitin-proteasome pathway. However, the main claims are only partially supported by the evidence. The following are my concerns and suggestions.

      1) In Figures 1 and 2, the authors convincingly demonstrate the synergistic impact of nifuroxazide and radiotherapy on curtailing the proliferation, colony formation, and migratory capabilities of HCC cells, while also instigating apoptosis in these cells. However, the underlying molecular mechanism remains elusive. A recent study highlighted nifuroxazide's potential to impede the proliferation of glioblastoma cells and induce apoptosis via the MAP3K1/JAK2/STAT3 pathway (Wang X., et al., Int Immunopharmacol. 2023 May;118:109987. doi: 10.1016/j.intimp.2023.109987). It would be valuable for the authors to investigate whether nifuroxazide employs a similar molecular mechanism to regulate proliferation and apoptosis in the context of HCC. This could offer deeper insights into the mechanisms at play in their observed effects.

      Response: Thank you very much for your insightful comments. As you pointed out, previous studies have reported that nifuroxazide exerts antitumor effects by inhibiting the STAT3 pathway. However, in our experiments, we observed that radiation therapy significantly increased the expression of PD-L1, but showed a trend of decreased p-STAT3 expression. Therefore, we believe that nifuroxazide does not inhibit PD-L1 expression through the STAT3 pathway. Subsequently, our further research revealed that the inhibitory effect of nifuroxazide on PD-L1 can be counteracted by a proteasome inhibitor. Thus, we propose that nifuroxazide inhibits PD-L1 expression through a proteasome-dependent mechanism, thereby enhancing the efficacy of radiation therapy in hepatocellular carcinoma.

      2) Figures 1 and 2 solely rely on the HepG2 cell line to establish their conclusions. To validate these findings robustly, it is recommended that another HCC cell line be included in the study. This additional cell line will contribute to the generalizability and reliability of the results, enhancing the overall credibility of the study's conclusions.

      Response: Thank you very much for your suggestion. We have included additional experimental results with the relevant cell line Huh7 (supplementary Fig 3).

      3) Figure 3 demonstrates the use of only one syngeneic mouse H22 tumor model. To ensure the robustness and validity of this finding, it would be advisable to incorporate at least one more syngeneic mouse HCC tumor model or even an orthotopic mouse tumor model. The inclusion of additional models would bolster the significance and reliability of the observed results, contributing to a more comprehensive understanding of the phenomenon under investigation.

      Response: Thank you for your valuable suggestion. In the H22 mouse tumor model, we assessed survival rate and tumor growth. The results confirm that the combination of nifuroxazide and radiation therapy exhibits a promising synergistic antitumor effect. However, before nifuroxazide combined with radiation therapy can be applied to the treatment of clinical hepatocellular carcinoma, extensive further research is needed, including validation in additional syngeneic mouse HCC tumor models. We have also added a discussion of this point to the manuscript.

      4) In Figure 5, employing an alternative method, such as the flow cytometry assay, to analyze and corroborate the tumor-infiltrating immune cell profiling following various treatments would enhance the rigor of the study. This additional approach would provide a complementary perspective and validate the findings, strengthening the overall reliability and impact of the results presented.

      Response: Thank you for your insightful suggestion. We have included additional experimental data to strengthen our study (supplementary Fig 2).

      5) In Figure 7, the conclusion drawn regarding nifuroxazide's impact on PD-L1 expression through ubiquitination-proteasome mechanisms seems to lack the robust evidence needed to firmly establish nifuroxazide's role in regulating PD-L1 ubiquitination. To reinforce this aspect of the study, the authors may conduct comprehensive in vitro and in vivo ubiquitination assays. Performing these assays would offer direct insights into whether nifuroxazide genuinely influences PD-L1 ubiquitination, thus fortifying the credibility and importance of the reported findings.

      Response: Thank you for your valuable feedback. Our initial findings suggest that nifuroxazide reduces PD-L1 protein levels but does not affect PD-L1 mRNA levels. Moreover, upon treatment with the proteasome inhibitor MG132, the inhibitory effect of nifuroxazide on PD-L1 was abolished. Concurrently, we observed that nifuroxazide significantly enhances GSK-3β expression in both cell and animal experiments. Consequently, we propose that nifuroxazide augments the degradation of PD-L1 protein.

      6) Statistical methods should be included in the captions of all the figures with statistical graphs. The size of the scale should be supplemented with a description in the captions.

      Response: Thank you for your valuable suggestion. We have made the appropriate modifications to our study based on your recommendations.

      7) Considering the outcomes presented in the study, it appears that the title "Nifuroxazide enhances radiotherapy efficacy against hepatocellular carcinoma by upregulating PD-L1 degradation via the ubiquitin-proteasome pathway" may not accurately reflect the findings.

      Response: Thank you for your insightful feedback. We have revised the title to read, "Inhibitory Effects of Nifuroxazide on PD-L1 Expression and Enhanced Radiotherapy Efficacy in Hepatocellular Carcinoma".

      References

      1) Xie C, Zhou X, Liang C, Li X, Ge M, Chen Y, et al. Apatinib triggers autophagic and apoptotic cell death via VEGFR2/STAT3/PD-L1 and ROS/Nrf2/p62 signaling in lung cancer. Journal of experimental & clinical cancer research : CR. 2021;40(1):266. doi: 10.1186/s13046-021-02069-4.

      2) de la Torre-Alaez M, Matilla A, Varela M, Inarrairaegui M, Reig M, Lledo JL, et al. Nivolumab after selective internal radiation therapy for the treatment of hepatocellular carcinoma: a phase 2, single-arm study. Journal for immunotherapy of cancer. 2022;10(11). doi: 10.1136/jitc-2022-005457.

      3) Yang F, Hu M, Lei Q, Xia Y, Zhu Y, Song X, et al. Nifuroxazide induces apoptosis and impairs pulmonary metastasis in breast cancer model. Cell Death Dis. 2015;6(3):e1701. doi: 10.1038/cddis.2015.63.

      4) Nelson EA, Walker SR, Kepich A, Gashin LB, Hideshima T, Ikeda H, et al. Nifuroxazide inhibits survival of multiple myeloma cells by directly inhibiting STAT3. Blood. 2008;112(13):5095-102. doi: 10.1182/blood-2007-12-129718.

    2. eLife assessment

      This valuable study evaluates the effects of nifuroxazide on radiotherapy for the treatment of hepatocellular carcinoma. Solid evidence is provided to support the conclusion that nifuroxazide facilitates the downregulation of PD-L1 and may improve therapy outcomes when combined with radiotherapy, though the inclusion of additional cell lines and animal models would have strengthened the study. This work will be of interest to cancer biologists and those working in immuno-oncology.

    3. Reviewer #1 (Public Review):

      The authors found that nifuroxazide has the potential to augment the efficacy of radiotherapy in HCC by reducing PD-L1 expression. This effect may be attributed to increased degradation of PD-L1 through the ubiquitination-proteasome pathway. This evidence supports the future application of nifuroxazide in the treatment of HCC.

    4. Reviewer #2 (Public Review):

      Summary:

      Zhao et al. aimed to explore an important question: how to overcome the resistance of hepatocellular carcinoma cells to radiotherapy. Given that an immune-suppressive microenvironment is a major mechanism underlying resistance to radiotherapy, they reasoned that a drug that blocks the PD-1/PD-L1 pathway could improve the efficacy of radiation therapy, and chose to investigate the effect of nifuroxazide, an inhibitor of STAT3 activation, on radiotherapy efficacy in treating hepatocellular carcinoma cells. From in vitro experiments, they find that combination treatment (nifuroxazide + radiotherapy) increases apoptosis and reduces proliferation and migration in comparison to radiotherapy alone. From in vivo experiments, they demonstrate that combined treatment reduces the size and weight of tumors and enhances mouse survival. These data indicate better efficacy of combination therapy compared to radiotherapy alone. Moreover, they also determined the effect of combination therapy on the tumor microenvironment as well as the peripheral immune response. Specifically, they find that combination therapy increases infiltration of CD4+ T cells, CD8+ T cells and NK cells, activates CD8+ T cells, enhances polarization of M1 macrophages and decreases Treg cells in the tumor microenvironment. These changes in the tumor microenvironment are consistent with reduced tumor growth under combination therapy. The most intriguing part of the study is the determination of the effect of nifuroxazide on PD-L1 expression in the context of radiotherapy. Considering that nifuroxazide is a STAT3 activation inhibitor and that STAT3 inhibition leads to reduced expression of PD-L1, one would expect nifuroxazide to decrease PD-L1 expression through STAT3. However, they find that the effect of nifuroxazide on PD-L1 is dependent on GSK3-mediated proteasome pathways and independent of STAT3 in the given experimental context.

      To determine the relevance to human hepatocellular carcinoma, they also measured PD-L1 expression in human tumor tissues of HCC patients pre- and post-radiotherapy. The increased PD-L1 expression level in HCC after radiotherapy is impressive.

      Overall, the data are convincing and support the conclusions.

      Strengths:

      1) Novel finding: identified a novel mechanism underlying the effect of nifuroxazide on PD-L1 expression in hepatocellular carcinoma cells in the context of radiotherapy.

      2) Comprehensive experimental approaches: different approaches were used to confirm the same finding. For example, both IHC and WB were used in Fig 4, and both IF and WB were used in Fig 5.

      3) Human disease relevance: observations in mice were compared with human tumor samples.

    5. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors investigated the potential of nifuroxazide to enhance responsiveness to radiotherapy, employing both an in vitro cell culture system and an in vivo syngeneic mouse tumor model.

      Strengths:

      The researchers conducted a series of experiments to elucidate the role of nifuroxazide in facilitating the radiotherapy-induced reduction of proliferation, migration, and invasion of HepG2 cells.

      Weaknesses:

      The evidence supporting the claim that nifuroxazide contributes to the degradation of radiotherapy-induced upregulation of PD-L1 via the ubiquitin-proteasome pathway is still relatively weak.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This work presents H3-OPT, a deep learning method that effectively combines existing techniques for the prediction of antibody structure. This work is important because the method can aid the design of antibodies, which are key tools in many research and industrial applications. The experiments for validation are solid.

      Comments to Author:

      Several points remain partially unclear, such as:

      1) Which examples constitute proper validation;

      Thank you for your kind reminder. We have modified the text of the experimental validation section to identify which examples constitute proper validation. We have changed the sentence “Finally, H3-OPT also shows lower Cα-RMSDs compared to AF2 or tFold-Ab for the majority of targets in an expanded benchmark dataset, including all antibody structures from CAMEO 2022” to “Finally, H3-OPT also shows lower Cα-RMSDs compared to AF2 or tFold-Ab for the majority (six of seven) of targets in an expanded benchmark dataset, including all antibody structures from CAMEO 2022” and added the following sentence to the experimental validation section of our revised manuscript: “AlphaFold2 outperformed IgFold on these targets”.

      2) What the relevance of the molecular dynamics calculations as performed is;

      Thank you for your comment, and we apologize for any confusion. The goal of our molecular dynamics calculations is to compare the binding affinities, an important issue in antibody engineering, of AlphaFold2-predicted complexes and H3-OPT-predicted complexes. Molecular dynamics simulations enable the investigation of the dynamic behaviors and interactions of these complexes over time. Unlike other tools for predicting binding free energy, MM/PBSA or MM/GBSA calculations provide dynamic properties of complexes by sampling conformational space, which helps in obtaining more accurate estimates of binding free energy. In summary, our molecular dynamics calculations demonstrated that the binding free energies of H3-OPT-predicted complexes are closer to those of native complexes. We have included the following sentence in our manuscript to provide an explanation of the molecular dynamics calculations: “Since affinity prediction plays a crucial role in antibody therapeutics engineering, we performed MD simulations to compare the differences in binding affinities between AF2-predicted complexes and H3-OPT-predicted complexes.”.

      3) The statistics for some of the comparisons;

      Thank you for the comment. We have incorporated statistics for some of the comparisons in the revised version of our manuscript and added the following sentence in the Methods section: “We conducted two-sided t-test analyses to assess the statistical significance of differences between the various groups. Statistical significance was considered when the p-values were less than 0.05. These statistical analyses were carried out using Python 3.10 with the Scipy library (version 1.10.1).”.

      4) The lack of comparison with other existing methods.

      We appreciate your valuable comments and suggestions. Conducting comparisons with a broader set of existing methods can further facilitate discussions on the strengths and weaknesses of each method, as well as the accuracy of our method. In our study, we conducted a comparison of H3-OPT with many existing methods, including AlphaFold2, HelixFold-Single, ESMFold, and IgFold. We demonstrated that several protein structure prediction methods, such as ESMFold and HelixFold-Single, do not match the accuracy of AlphaFold2 in CDR-H3 prediction. Additionally, we performed a detailed comparison between H3-OPT, AlphaFold2, and IgFold (the latest antibody structure prediction method) for each target.

      We sincerely thank you for the comment and have introduced a comparison with OmegaFold. The results have been incorporated into the relevant sections (Fig 4a-b) of the revised manuscript.

      Author response image 1.

      Public Reviews

      Comments to Author:

      Reviewer #1 (Public Review):

      Summary:

      The authors developed a deep learning method called H3-OPT, which combines the strength of AF2 and PLM to reach better prediction accuracy of antibody CDR-H3 loops than AF2 and IgFold. These improvements will have an impact on antibody structure prediction and design.

      Strengths:

      The training data are carefully selected and clustered, the network design is simple and effective.

      The improvements include smaller average Ca RMSD, backbone RMSD, side chain RMSD, more accurate surface residues and/or SASA, and more accurate H3 loop-antigen contacts.

      The performance is validated from multiple angles.

      Weaknesses:

      1) There are very limited prediction-then-validation cases, basically just one case.

      Thanks for pointing out this issue. The number of prediction-then-validation cases is helpful to show the generalization ability of our model. However, obtaining experimental structures is both costly and labor-intensive. Furthermore, experimental validation cases only capture a limited portion of the sequence space in comparison to the broader diversity of antibody sequences.

      To address this challenge, we have collected different datasets to serve as benchmarks for evaluating the performance of H3-OPT, including our non-redundant test set and the CAMEO dataset. The introduction of these datasets allows for effective assessments of H3-OPT’s performance without biases and tackles the obstacle of limited prediction-then-validation cases.

      Reviewer #2 (Public Review):

      This work provides a new tool (H3-Opt) for the prediction of antibody and nanobody structures, based on the combination of AlphaFold2 and a pre-trained protein language model, with a focus on predicting the challenging CDR-H3 loops with greater accuracy than previously developed approaches. This task is of high value for the development of new therapeutic antibodies. The paper provides an external validation consisting of 131 sequences, with further analysis of the results by segregating the test sets into three subsets of varying difficulty and comparison with other available methods. Furthermore, the approach was validated by comparing three experimentally solved 3D structures of anti-VEGF nanobodies with the H3-Opt predictions.

      Strengths:

      The experimental design to train and validate the new approach has been clearly described, including the dataset compilation and its representative sampling into training, validation and test sets, and structure preparation. The results of the in-silico validation are quite convincing and support the authors' conclusions.

      The datasets used to train and validate the tool and the code are made available by the authors, which ensures transparency and reproducibility, and allows future benchmarking exercises with incoming new tools.

      Compared to AlphaFold2, the authors' optimization seems to produce better results for the most challenging subsets of the test set.

      Weaknesses:

      1) The scope of the binding affinity prediction using molecular dynamics is not that clearly justified in the paper.

      We sincerely appreciate your valuable comment. We have added the following sentence in our manuscript to justify the scope of the molecular dynamics calculations: “Since affinity prediction plays a crucial role in antibody therapeutics engineering, we performed MD simulations to compare the differences in binding affinities between AF2-predicted complexes and H3-OPT-predicted complexes.”.

      2) Some parts of the manuscript should be clarified, particularly the ones that relate to the experimental validation of the predictions made by the reported method. It is not absolutely clear whether the experimental validation is truly a prospective validation. Since the methodological aspects of the experimental determination are not provided here, it seems that this may not be the case. This is a key aspect of the manuscript that should be described more clearly.

      Thank you for the reminder about experimental validation of our predictions. The sequence identities of the wild-type nanobody VH domain and H3 loop, when compared with the best template, are 0.816 and 0.647, respectively. As a result, these mutants exhibited low sequence similarity to our dataset, indicating the absence of prediction bias for these targets. Thus, H3-OPT outperformed IgFold on these mutants, demonstrating our model's strong generalization ability. In summary, the experimental validation actually serves as a prospective validation.

      Thank you for your comments. We have added the following sentences to describe the methodological aspects of the experimental determination: “The protein expression, purification and crystallization experiments were described previously. The proteins used in the crystallization experiments were unlabeled. Upon thawing the frozen protein on ice, we performed a centrifugation step to eliminate any potential crystal nucleus and precipitants. Subsequently, we mixed the protein at a 1:1 ratio with commercial crystal condition kits using the sitting-drop vapor diffusion method facilitated by the Protein Crystallization Screening System (TTP LabTech, mosquito). After several days of optimization, single crystals were successfully cultivated at 21°C and promptly flash-frozen in liquid nitrogen. The diffraction data from various crystals were collected at the Shanghai Synchrotron Research Facility and subsequently processed using the aquarium pipeline.”

      3) Some Figures would benefit from a clearer presentation.

      We sincerely thank you for your careful reading. Following your comments, we have made extensive modifications to make our presentation clearer and more convincing (Fig 2c-f).

      Author response image 2.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript introduces a new computational framework for choosing 'the best method' according to the case for getting the best possible structural prediction for the CDR-H3 loop. The authors show their strategy improves on average the accuracy of the predictions on datasets of increasing difficulty in comparison to several state-of-the-art methods. They also show the benefits of improving the structural predictions of the CDR-H3 in the evaluation of different properties that may be relevant for drug discovery and therapeutic design.

      Strengths:

      The authors introduce a novel framework, which can be easily adapted and improved. The authors use a well-defined dataset to test their new method. A modest average accuracy gain is obtained in comparison to other state-of-the art methods for the same task while avoiding testing different prediction approaches.

      Weaknesses:

      1) The accuracy gain is mainly ascribed to easy cases, while the accuracy and precision for moderate to challenging cases are comparable to other PLM methods (see Fig. 4b and Extended Data Fig. 2). That raises the question: how likely is it to be in a moderate or challenging scenario? For example, it is not clear whether the comparison to the solved X-ray structures of anti-VEGF nanobodies represents an easy or challenging case for H3-OPT. The mutant nanobodies seem not to provide any further validation as the single mutations are very far away from the CDR-H3 loop and they do not disrupt the structure in any way. Indeed, RMSD values follow the same trend in H3-OPT and IgFold predictions (Fig. 4c). A more challenging test and interesting application could be solving the structure of a designed or mutated CDR-H3 loop.

      Thank you for your rigorous consideration. When the experimental structure is unavailable, it is difficult to determine directly whether a target is easy to predict or challenging. We constructed our non-redundant test set so that the number of easy-to-predict targets is comparable to that of the other two groups. Due to the limited availability of experimental antibody structures, especially nanobody structures, accurately predicting CDR-H3 remains a challenge. In our manuscript, we discuss the strengths and weaknesses of AlphaFold2 and other PLM-based methods, and we introduce H3-OPT as a comprehensive solution for antibody CDR3 modeling.

      We also appreciate your comment on experimental structures. We fully agree with your opinion and attempted to solve the experimental structures of seven mutants, including two mutants (Y95F and Q118N) that are close to the CDR-H3 loop. Unfortunately, although we tried seven different reagent kits with a total of 672 crystallization conditions, we were unable to obtain crystals for these mutants. Although the mutants we successfully solved may not have significantly disrupted the structures of the CDR-H3 loops, they still provided valuable insights into the differences between MSA-based methods and MSA-free methods (such as IgFold) for antibody structure modeling.

      We have further conducted a benchmarking study using two examples, PDBID 5U15 and 5U0R, both consisting of 18 residues in CDR-H3, to evaluate H3-OPT's performance in predicting mutated H3 loops. In the first case (target 5U15), AlphaFold2 failed to provide an accurate prediction of the extended orientation of the H3 loop, resulting in a less accurate prediction (Cα-RMSD = 10.25 Å) compared to H3-OPT (Cα-RMSD = 5.56 Å). In the second case (target 5U0R, a mutant of 5U15 in CDR3 loop), AlphaFold2 and H3-OPT achieved Cα-RMSDs of 6.10 Å and 4.25 Å, respectively. Additionally, the Cα-RMSDs of OmegaFold predictions were 8.05 Å and 9.84 Å, respectively. These findings suggest that both AlphaFold2 and OmegaFold effectively captured the mutation effects on conformations but achieved lower accuracy in predicting long CDR3 loops when compared to H3-OPT.
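      For readers less familiar with the metric quoted above, the Cα-RMSD calculation can be sketched as follows. This is a minimal illustration only, assuming the predicted and native loops have already been superimposed and contain the same residues in the same order; the function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the H3-OPT code or from the 5U15/5U0R structures.

```python
import math

def ca_rmsd(pred, native):
    """Root-mean-square deviation (in Å) between two equal-length lists of
    (x, y, z) Cα coordinates. Assumes the two structures are already
    superimposed; no alignment is performed here."""
    if len(pred) != len(native) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (px - nx) ** 2 + (py - ny) ** 2 + (pz - nz) ** 2
        for (px, py, pz), (nx, ny, nz) in zip(pred, native)
    )
    return math.sqrt(squared / len(pred))

# Toy three-residue example (made-up coordinates, for illustration only).
native_loop = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted_loop = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0)]
print(round(ca_rmsd(predicted_loop, native_loop), 3))
```

      In practice the superposition step (for example on the framework residues) determines which deviations are attributed to the loop, so values are only comparable when the same alignment protocol is used for every method.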

      2) The proposed method lacks a confidence score or a warning to help guide the users in moderate to challenging cases.

      We appreciate your suggestions and we have trained a separate module to predict confidence scores. We used the MSE loss for confidence prediction, where the label error was calculated as the Cα deviation of each residue after alignment. The inputs of this module are the same as those used for H3-OPT, and it generates a confidence score ranging from 0 to 100.
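      The training target described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: the helper names `per_residue_deviation` and `mse_loss`, the toy coordinates, and the predicted error values are all hypothetical, and the mapping from predicted error to the 0 to 100 confidence score is not specified in the response, so it is deliberately left out.

```python
import math

def per_residue_deviation(pred, native):
    """Label errors: the Cα-Cα distance (Å) of each residue, assuming the
    predicted and native structures have already been aligned."""
    return [math.dist(p, n) for p, n in zip(pred, native)]

def mse_loss(predicted_errors, label_errors):
    """Mean squared error between predicted and observed per-residue errors."""
    if len(predicted_errors) != len(label_errors) or not label_errors:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(
        (p - y) ** 2 for p, y in zip(predicted_errors, label_errors)
    ) / len(label_errors)

# Toy example with made-up coordinates and made-up module outputs.
native_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted_ca = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0)]
labels = per_residue_deviation(predicted_ca, native_ca)
module_output = [0.5, 2.0, 4.0]  # hypothetical predicted errors in Å
loss = mse_loss(module_output, labels)
print(labels, round(loss, 4))
```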

      3) The fact that AF2 outperforms H3-OPT in some particular cases (e.g. Fig. 2c and Extended Data Fig. 3) raises the question: is there still room for improvements? It is not clear how sensitive H3-OPT is to the defined parameters. Along the same lines, benchmarking against other available prediction algorithms, such as OmegaFold, could shed light on the actual accuracy limit.

      We totally understand your concern. Many papers have suggested that PLM-based models are computationally efficient but may have unsatisfactory accuracy when high-resolution templates and MSA are available (Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies, Ruffolo, J. A. et al, 2023). However, the accuracy of AF2 decreases substantially when the MSA information is limited. Therefore, we directly retained high-confidence structures of AF2 and introduced a PSPM to improve the accuracy of the targets with long CDR-H3 loops and few sequence homologs. The improvement in mean Cα-RMSD shows that there is still room for more accurate prediction of CDR-H3 loops.

      We also appreciate your kind comment on defined parameters. In fact, once a benchmark dataset is established, determining an optimal cutoff value through parameter searching can indeed further improve the performance of H3-OPT in CDR3 structure prediction. However, it is important to note that this optimal cutoff value heavily depends on the testing dataset being used. Therefore, we provide a recommended cutoff value and offer a program interface for users who wish to manually define the cutoff value based on their specific requirements. Here, we showed the average Cα-RMSDs of our test set under different confidence cutoffs and the results have been added in the text accordingly.
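      The cutoff scan described above can be sketched as follows. This is a minimal sketch, assuming that the AF2 loop is kept when its confidence is at or above the cutoff and that the PSPM prediction is used otherwise; the function name `mean_rmsd_at_cutoff`, the dictionary keys, and every number in the toy data are hypothetical and do not reproduce the values in Author response table 1.

```python
def mean_rmsd_at_cutoff(targets, cutoff):
    """Average Cα-RMSD over targets when the AF2 prediction is accepted for
    confidence >= cutoff and the PSPM prediction is used otherwise.
    The >= comparison is an assumption made for this sketch."""
    chosen = [
        t["af2_rmsd"] if t["af2_conf"] >= cutoff else t["pspm_rmsd"]
        for t in targets
    ]
    return sum(chosen) / len(chosen)

# Made-up benchmark entries, for illustration only.
toy_targets = [
    {"af2_conf": 0.9, "af2_rmsd": 1.0, "pspm_rmsd": 2.0},
    {"af2_conf": 0.7, "af2_rmsd": 3.0, "pspm_rmsd": 2.0},
    {"af2_conf": 0.5, "af2_rmsd": 6.0, "pspm_rmsd": 4.0},
]

for cutoff in (0.4, 0.6, 0.8):
    print(cutoff, round(mean_rmsd_at_cutoff(toy_targets, cutoff), 3))
```

      Because the best cutoff found this way is tuned to the benchmark it was scanned on, it should be checked on held-out targets before being reused, which is consistent with the caveat in the response about dependence on the test set.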

      Author response table 1.

      We also appreciate your reminder, and we have conducted a benchmark against OmegaFold. The results have been included in the manuscript (Fig 4a-b).

      Author response image 3.

      Reviewer #1 (Recommendations For The Authors):

      1) In Fig 3a, please also compare IgFold and H3-OPT (merge Fig. S2 into Fig 3a)

      In Fig 3b, please separate Sub2 and Sub3, and add IgFold's performance.

      Thank you very much for your professional advice. We have made revisions to the figures based on your suggestions.

      Author response image 4.

      2) For the three experimentally solved structures of anti-VEGF nanobodies, what are the sequence identities of the VH domain and H3 loop, compared to the best available template? What is the length of the H3 loop? Which category (Sub1/2/3) do the targets belong to? What is the performance of AF2 or AF2-Multimer on the three targets?

      We apologize for the confusion. The sequence identities of the VH domain and H3 loop are 0.816 and 0.647, respectively, compared with the best template. The CDR-H3 loops of these nanobodies are all 17 residues long. According to our classification strategy, these nanobodies belong to Sub1. The confidence scores of the AlphaFold2-predicted loops were all higher than 0.8, and these loops were accepted as the outputs of H3-OPT by CBM.

      3) Is AF2-Multimer better than AF2, when using the sequences of antibody VH and antigen as input?

      Thanks for your suggestion. Many papers have benchmarked AlphaFold2-Multimer for protein complex modeling and demonstrated that its accuracy in predicting protein complexes is far from satisfactory (Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants, Rui Yin, et al., 2022). Additionally, there is no significant difference between AlphaFold2 and AlphaFold2-Multimer in antibody modeling (Structural Modeling of Nanobodies: A Benchmark of State-of-the-Art Artificial Intelligence Programs, Mario S. Valdés-Tresanco, et al., 2023).

      From the data perspective, we employed a non-redundant dataset for training and validation. Since these structures are valuable, considering the antigen sequence would reduce the size of our dataset, potentially leading to underfitting.

      4) For H3 loop grafting, I noticed that only identical target and template H3 sequences can trigger grafting (lines 348-349). How many such cases are in the test set?

      We appreciate your comment from this perspective. There are thirty targets in our database with identical CDR-H3 templates.

      Reviewer #2 (Recommendations For The Authors):

      • It is not clear to me whether the three structures apparently used as experimental confirmation of the predictions have been determined previously in this study or not. This is a key aspect, as a retrospective validation does not have the same conceptual value as a prospective, a posteriori validation. Please note that different parts of the text suggest different things in this regard "The model was validated by experimentally solving three structures of anti-VEGF nanobodies predicted by H3-OPT" is not exactly the same as "we then sought to validate H3-OPT using three experimentally determined structures of anti-VEGF nanobodies, including a wild-type (WT) and two mutant (Mut1 and Mut2) structures, that were recently deposited in protein data bank". The authors are kindly advised to make this point clear. By the way, "protein data bank" should be in upper case letters.

      We gratefully thank you for your feedback and fully understand your concerns. To validate the performance of H3-OPT, we initially solved the structures of both the wild-type and mutants of anti-VEGF nanobodies and submitted these structures to Protein Data Bank. We have corrected “that were recently deposited in protein data bank” into “that were recently deposited in Protein Data Bank” in our revised manuscript.

      • It would be good to clarify the goal and importance of the binding affinity prediction, as it seems a bit disconnected from the rest of the paper. Also, it would be good to include the production MD runs as Sup. Mat.

      Thanks for your valuable comment. We have added the following sentence in our manuscript to clarify the goal and importance of the molecular dynamics calculations: “Since affinity prediction plays a crucial role in antibody therapeutics engineering, we performed MD simulations to compare the differences in binding affinities between AF2-predicted complexes and H3-OPT-predicted complexes.”. The details of production runs have been described in Method section.

      • Has any statistical test been performed to compare the mean Cα-RMSD values across the modeling approaches included in the benchmark exercise?

      Thanks for this kind recommendation. We conducted a statistical test to assess the performance of different modeling approaches and demonstrated significant improvements with H3-OPT compared to other methods (p<0.001). Additionally, we have trained H3-OPT with five random seeds and compared mean Cα-RMSD values with all five models of AF2. Here, we showed the average Cα-RMSDs of H3-OPT and AlphaFold2.

      Author response table 1.
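
      The response above does not name the statistical test that was used. As one possible illustration only, a paired sign-flip permutation test on per-target Cα-RMSD differences could be sketched as follows. The function name and the toy RMSD values are hypothetical and are not taken from the manuscript. The exact enumeration is only practical for small samples; for a test set of 131 targets one would sample random sign flips instead.

```python
from itertools import product


def paired_sign_flip_test(rmsd_a, rmsd_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis that both methods are equally accurate, the sign
    of each per-target difference is arbitrary, so every sign pattern is
    enumerated and compared with the observed summed difference.
    """
    if len(rmsd_a) != len(rmsd_b) or not rmsd_a:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(rmsd_a, rmsd_b)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so that ties caused by rounding are counted.
        if stat >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Hypothetical per-target Cα-RMSDs (in Å) for two methods on the same six targets.
toy_method_a = [2.1, 1.8, 3.0, 2.5, 1.9, 2.7]
toy_method_b = [1.6, 1.5, 2.2, 2.1, 1.6, 2.0]
toy_p_value = paired_sign_flip_test(toy_method_a, toy_method_b)
```

      A paired test is chosen in this sketch because every method is evaluated on the same targets, so per-target differences are more informative than a comparison of two unpaired means.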

      • In Fig. 2c-f, I think it would be adequate to make the ordering criterion of the data points explicit in the caption or the graph itself.

      We appreciate your comment and suggestion. We have revised the graph in the manuscript accordingly.

      Author response image 5.

      • Please revise Figure S2 caption and/or its content. It is not clear, in parts b and c, which is the performance of H3-OPT. Why weren't some other antibody-specific tools such as IgFold included in this comparison?

      Thanks for your comments. The performance of H3-OPT is not included in Figure S2. Prior to training H3-OPT, we conducted several preliminary studies, and the detailed results are available in the supplementary sections. We showed that AlphaFold2 outperformed other methods (including AI-based methods and TBM methods) and produced sub-angstrom predictions in framework regions. The comparison of IgFold with other methods was discussed in a previous work (Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies, Ruffolo, J. A. et al, 2023). That study found that IgFold largely yielded results comparable to AlphaFold2 but at a lower prediction cost. Additionally, we have conducted a detailed comparison of CDR-H3 loops with IgFold in our main text.

      • It is stated that "The relative binding affinities of the antigen-antibody complexes were evaluated using the Python script...". Which Python script?

      Thank you for your comments, and we apologize for the confusion. This Python script is a module of the AMBER software. We have corrected “The relative binding affinities of the antigen-antibody complexes were evaluated using the python script” into “The relative binding affinities of the antigen-antibody complexes were evaluated using the MMPBSA module of AMBER software”.

      Reviewer #3 (Recommendations For The Authors):

      Does H3-OPT improve the AF2 score on the CDR-H3? It would be interesting to see whether grafted and PSPM loops improve the pLDDT score by using for example AF2Rank [https://doi.org/10.1103/PhysRevLett.129.238101]. That could also be a way to include a confidence score into H3-OPT.

      We are grateful for your question. The current version of H3-OPT cannot provide a confidence score for its output, so we did not know whether H3-OPT improves the AF2 score.

      We appreciate your kind recommendations and have calculated the pLDDT scores of all models predicted by H3-OPT and AF2 using AF2Rank. We showed that the average of pLDDT scores of different predicted models did not match the results of Cα-RMSD values.

      Author response table 3.

      Therefore, we have trained a separate module to predict the confidence score of the optimized CDR-H3 loops. We hope that this module can provide users with reliable guidance on whether to use predicted CDR-H3 loops.

      The test case of Nb PDB id. 8CWU is an interesting example where AF2 outperforms H3-OPT and PLMs. The top AF2 model according to ColabFold (using default options and no template [https://doi.org/10.1038/s41592-022-01488-1]) shows a remarkably good model of the CDR-H3, explaining the low Ca-RMSD in the Extended Data Fig. 3. However, the pLDDT score of the 4 tip residues (out of 12), forming the hairpin of the CDR-H3 loop, pushes down the average value below the CBM cut-off of 80. I wonder if there is a lesson to learn from that test case. How sensitive is H3-OPT to the CBM cut-off definition? Have the authors tried weighting the residue pLDDT score by some structural criteria before averaging? I guess AF2 may have less confidence in hydrophobic tip residues in exposed loops as the solvent context may not provide enough support for the pLDDT score.

      Thanks for your valuable feedback. We showed the average Cα-RMSDs of our test set under different confidence cutoffs and the results have been added in the text accordingly.

      Author response table 4.

      We greatly appreciate your comment on this perspective. Inspired by your kind suggestions, we will explore the relationship between cutoff values and structural information in related work. Your feedback is highly valuable as it will contribute to the development of our approach.
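
      The cutoff question raised in this exchange can be illustrated with a small sketch. It assumes a simplified view of the confidence-based decision, in which the AF2 loop is kept when its mean confidence reaches the cutoff and an alternative loop is used otherwise, and it shows how down-weighting low-confidence tip residues changes the averaged score. All names, the 0 to 1 confidence scale, and the toy values are hypothetical; this is not the H3-OPT implementation.

```python
from statistics import mean


def sweep_cutoffs(targets, cutoffs):
    """Return the mean Cα-RMSD obtained at each confidence cutoff."""
    summary = {}
    for cutoff in cutoffs:
        # Keep the AF2 loop at or above the cutoff; otherwise fall back to
        # the alternative (re-predicted) loop.
        chosen = [
            t["af2_rmsd"] if t["af2_conf"] >= cutoff else t["alt_rmsd"]
            for t in targets
        ]
        summary[cutoff] = mean(chosen)
    return summary


def weighted_confidence(plddt, weights):
    """Weighted mean of per-residue confidence scores."""
    if len(plddt) != len(weights):
        raise ValueError("plddt and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(plddt, weights)) / total


# Hypothetical targets: AF2 loop confidence, AF2 RMSD and alternative RMSD (Å).
toy_targets = [
    {"af2_conf": 0.92, "af2_rmsd": 0.8, "alt_rmsd": 1.4},
    {"af2_conf": 0.78, "af2_rmsd": 1.0, "alt_rmsd": 2.0},
    {"af2_conf": 0.60, "af2_rmsd": 3.5, "alt_rmsd": 2.5},
]
cutoff_summary = sweep_cutoffs(toy_targets, [0.5, 0.7, 0.8, 0.9])
best_cutoff = min(cutoff_summary, key=cutoff_summary.get)

# Hypothetical 12-residue loop: 8 well-predicted residues and 4 tip residues.
loop_plddt = [0.90] * 8 + [0.55] * 4
plain_mean = weighted_confidence(loop_plddt, [1.0] * 12)
tip_downweighted = weighted_confidence(loop_plddt, [1.0] * 8 + [0.5] * 4)
```

      In this toy setting the second target resembles the 8CWU situation: a good AF2 loop falls just below a cutoff of 0.8 and is replaced, which is why the best cutoff depends on the data set being evaluated.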

      A comparison against the new folding prediction method OmegaFold [https://doi.org/10.1101/2022.07.21.500999] is missing. OmegaFold seems to outperform AF2, ESM, and IgFold among others in predicting the CDR-H3 loop conformation (See [https://doi.org/10.3390/molecules28103991] and [https://doi.org/10.1101/2022.07.21.500999]). Indeed, prediction of anti-VEGF Nb structure (PDB WT_QF_0329, chain B in supplementary data) by OmegaFold as implemented in ColabFold [https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/beta/omegafold.ipynb] and setting 10 cycles, renders Ca-RMSD 1.472 Å for CDR-H3 (residues 98-115).

      We appreciate your valuable suggestion. We have added the comparison against OmegaFold in our manuscript. The results have been included in the manuscript (Fig 4a-b).

      Author response image 6.

      In our test set, OmegaFold outperformed ESMFold in predicting the CDR-H3 loop conformation. However, it failed to match the accuracy of AF2, IgFold, and H3-OPT. We discussed the difference between MSA-based methods (such as AlphaFold2) and MSA-free methods (such as IgFold) in predicting CDR-H3 loops. Similarly, OmegaFold provided results comparable to HelixFold-Single and other MSA-free methods but still failed to match the accuracy of AlphaFold2 and H3-OPT on Sub1.

      The time-consuming step in H3-OPT is the AF2 prediction. However, most of the time is spent in modeling the mAb and Nb scaffolds, which are already very well predicted by PLMs (See Fig. 4 in [https://doi.org/10.3390/molecules28103991]). Hence, why not use e.g. OmegaFold as the first step, whose score also correlates to the RMSD values [https://doi.org/10.3390/molecules28103991]? If that fails, then use AF2 or grafting. Alternatively, use a PLM model to generate a template, remove/mask the CDR loops (at least CDR-H3), and pass it as a template to AF2 to optimize the structure with or without MSA (e.g. using AF2Rank).

      Thanks for your professional feedback. It is true that the speed of MSA searching limits the application of high-throughput structure prediction. Previous studies have demonstrated that deep learning methods perform well on framework residues. We once tried to directly predict the conformations of CDR-H3 loops using PLM-based methods, but this initial version of H3-OPT lacking the CBM could not replicate the accuracy of AF2 in Sub1. Similarly, we showed that IgFold and OmegaFold also provide lower accuracy in Sub1 (average Cα-RMSD is 1.71 Å and 1.83 Å, respectively, whereas AF2 predicted an average of 1.07 Å). Therefore, the predictions of AlphaFold2 not only produce scaffolds but also provide the highest quality of CDR-H3 loops when high-resolution templates and MSA are available.

      Thank you once again for your kind recommendation. In the current version of H3-OPT, we have highlighted the strengths of H3-OPT in combining the AF2 and PLM models in various scenarios. AF2 can provide accurate predictions for short loops with fewer than 10 amino acids, and PLM-based models show little or no improvement in such cases. In the next version of H3-OPT, as the first step, we plan to replace the AF2 models with other methods if any accurate MSA-free method becomes available in the future.

      Line 115: The statement "IgFold provided higher accuracy in Sub3" is not supported by Fig. 2a.

      We are sorry for our carelessness. We have corrected “IgFold provided higher accuracy in Sub3” into “IgFold provided higher accuracy in Sub3 (Fig. 3a)”.

      Lines 195-203: What is the statistical significance of results in Fig 5a and 5b?

      Thank you for your kind comments. The number of surface residues in AF2 models is significantly higher than that in H3-OPT models (p < 0.005). In Fig. 5b, H3-OPT models predicted lower values than AF2 models in terms of various surface properties, including polarity (p < 0.05) and hydrophilicity (p < 0.001).

      Lines 212-213: It is not easy to compare and quantify the differences between electrostatic maps in Fig. 5d. Showing a Dmap (e.g. mapmodel - mapexperiment) would be a better option. Additionally, there is no methodological description of how the maps were generated nor the scale of the represented potential.

      Thank you for pointing this out. We have modified the figure (Fig. 5d) according to your kind recommendation and added following sentences to clarify the methodological description on the surface electrostatic potential:

      “Analysis of surface electrostatic potential

      We generated two-dimensional projections of CDR-H3 loop’s surface electrostatic potential using SURFMAP v2.0.0 (based on GitHub from February 2023: commit: e0d51a10debc96775468912ccd8de01e239d1900) with default parameters. The 2D surface maps were calculated by subtracting the surface projection of H3-OPT or AF2 predicted H3 loops to their native structures.”

      Author response image 7.

      Lines 237-240 and Table 2: What is the meaning of comparing the average free energy of the whole set? Why free energies should be comparable among test cases? I think the correct way is to compare the mean pair-to-pair difference to the experimental structure. Similarly, reporting a precision in the order of 0.01 kcal/mol seems too precise for the used methodology, what is the statistical significance of the results? Were sampling issues accounted for by performing replicates or longer MDs?

      Thanks for your rigorous advice and for pointing out these issues. We have modified the comparisons of free energies between the prediction methods and corrected the precision of these results. The average binding free energy of H3-OPT complexes is lower than that of AF2-predicted complexes, but there is no significant difference between these energies (p > 0.05).

      Author response table 4.

      Comparison of binding affinities obtained from MD simulations using AF2 and H3-OPT.

      Thanks for your comments on this perspective. Longer MD simulations often achieve better convergence for the average behavior of the system, while replicates provide insights into the variability and robustness of the results. In our manuscript, each MD simulation had a length of 100 nanoseconds, with the initial 90 nanoseconds dedicated to achieving system equilibrium, which was verified by monitoring RMSD (Root Mean Square Deviation). The remaining 10 nanoseconds of each simulation were used for the calculation of free energy. This approach allowed us to balance the need for extensive sampling with the verification of system stability.

      Regarding MD simulations for CDR-H3 refinement, its successful application highly depends on the starting conformation, the force field, and the sampling strategy [https://doi.org/10.1021/acs.jctc.1c00341]. In particular, the applied plain MD seems a very limited strategy (there is not much information about the simulated times in the supplementary material). Similarly, local structure optimizations with QM methods are not expected to improve a starting conformation that is far from the experimental conformation.

      Thank you very much for your valuable feedback. We fully agree with your insights regarding the limitations of MD simulations. Before training H3-OPT, we showed the challenge of accurately predicting CDR-H3 structures. We then tried to optimize the CDR-H3 loops with computational tools, such as MD simulations and QM methods (detailed information on the MD simulations is provided in the main text). Unfortunately, these methods did not improve the accuracy of AF2-predicted CDR-H3 loops. These results showed that MD simulations and QM methods are not only time-consuming but also failed to optimize the CDR-H3 loops. Therefore, we developed H3-OPT to tackle these issues and improve the accuracy of CDR-H3 prediction for the development of antibody therapeutics.

      Text improvements

      Relevant statistical and methodological parameters are presented in a dispersed manner throughout the text. For example, the number of structures in test, training, and validation datasets is first presented in the caption of Fig. 4. Similarly, the sequence identity % to define redundancy is defined in the caption of Fig. 1a instead of lines 87-88, where authors define "we constructed a non-redundant dataset with 1286 high-resolution (<2.5 Å)". Is the sequence redundancy for the CDR-H3 or the whole mAb/Nb?

      Thank you for pointing out these issues. We have added the number of structures in each subgroup in the caption of Fig. 1a: “Clustering of the filtered, high-resolution structures yielded three datasets for training (n = 1021), validation (n = 134), and testing (n = 131).” and corrected “As data quality has large effects on prediction accuracy, we constructed a non-redundant dataset with 1286 high-resolution (<2.5 Å) antibody structures from SAbDab” into “As data quality has large effects on prediction accuracy, we constructed a non-redundant dataset (sequence identity < 0.8) with 1286 high-resolution (<2.5 Å) antibody structures from SAbDab” in the revised manuscript. The sequence redundancy applies to the whole mAb/Nb.

      The description of ablation studies is not easy to follow. For example, what does removing TGM mean in practical terms (e.g. only AF2 is used, or PSPM is applied if AF2 score < 80)? Similarly, what does removing CBM mean in practical terms (e.g. all AF2 models are optimized by PSPM, and no grafting is done)?

      Thanks for your comments and suggestions. We have corrected “d, Differences in H3-OPT accuracy without the template module. e, Differences in H3-OPT accuracy without the CBM. f, Differences in H3-OPT accuracy without the TGM.” into “d, Differences in H3-OPT accuracy without the template module. This ablation study means only PSPM is used. e, Differences in H3-OPT accuracy without the CBM. This ablation study means input loop is optimized by TGM and PSPM. f, Differences in H3-OPT accuracy without the TGM. This ablation study means input loop is optimized by CBM and PSPM.”.

      Authors should report the values in the text using the same statistical descriptor that is used in the figures to help the analysis by the reader. For example, in lines 223-224 a precision score of 0.75 for H3-OPT is reported in the text (I assume this is the average value), while the median of ~0.85 is shown in Fig. 6a.

      Thank you for your careful checks. We have corrected “After identifying the contact residues of antigens by H3-OPT, we found that H3-OPT could substantially outperform AF2 (Fig. 6a), with a precision of 0.75 and accuracy of 0.94 compared to 0.66 precision and 0.92 accuracy of AF2.” into “After identifying the contact residues of antigens by H3-OPT, we found that H3-OPT could substantially outperform AF2 (Fig. 6a), with a median precision of 0.83 and accuracy of 0.97 compared to 0.64 precision and 0.95 accuracy of AF2.” in the revised manuscript.
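
      To make the distinction between the reported descriptors concrete, the following sketch computes precision and accuracy for one set of predicted contact residues and then contrasts the mean and the median over several targets. The residue numbers and per-target values are hypothetical, and the definitions used here are the standard ones; the manuscript may define contacts differently.

```python
from statistics import mean, median


def contact_metrics(predicted, actual, all_residues):
    """Precision and accuracy of predicted contact residues for one target."""
    predicted, actual, all_residues = set(predicted), set(actual), set(all_residues)
    if not all_residues:
        raise ValueError("all_residues must not be empty")
    tp = len(predicted & actual)                 # predicted contacts that are real
    fp = len(predicted - actual)                 # predicted contacts that are not real
    tn = len(all_residues - predicted - actual)  # correctly predicted non-contacts
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(all_residues)
    return precision, accuracy


# Hypothetical antigen with 20 residues, 4 true contacts and 4 predicted contacts.
toy_precision, toy_accuracy = contact_metrics(
    predicted={3, 4, 5, 11},
    actual={3, 4, 5, 10},
    all_residues=range(1, 21),
)

# Hypothetical per-target precisions: one poor target lowers the mean
# while leaving the median unchanged.
per_target_precision = [0.4, 0.8, 0.9]
mean_precision = mean(per_target_precision)
median_precision = median(per_target_precision)
```

      Reporting the same descriptor in the text and in the figure avoids the apparent mismatch that arises when a skewed distribution is summarized once by its mean and once by its median.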

      Minor corrections

      Lines 91-94: What do length values mean? e.g. is 0-2 Å the RMSD from the experimental structure?

      We appreciate your comment and apologize for any confusion. The RMSD values are indeed measured against the experimental structure. The RMSD evaluates the deviation of the predicted CDR-H3 loop from the native structure and also reflects the degree of prediction difficulty for AlphaFold2. We have added the following sentence in the revised manuscript: “(RMSD, a measure of the difference between the predicted structure and an experimental or reference structure)”.
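
      For reference, a Cα-RMSD of the kind discussed here can be computed as in the minimal sketch below. It assumes that the predicted and experimental coordinates are already superimposed and listed in the same residue order; the superposition step itself is not shown, and the coordinates are hypothetical.

```python
import math


def ca_rmsd(pred_coords, ref_coords):
    """Root-mean-square deviation between matched Cα coordinates (same units as input)."""
    if len(pred_coords) != len(ref_coords) or not pred_coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p[axis] - r[axis]) ** 2
        for p, r in zip(pred_coords, ref_coords)
        for axis in range(3)
    )
    return math.sqrt(squared / len(pred_coords))


# Hypothetical three-residue loop in which every Cα is displaced by 1 Å along z.
toy_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_pred = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
toy_rmsd = ca_rmsd(toy_pred, toy_ref)
```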

      Line 120: is the "AF2 confidence score" for the full-length or CDR-H3?

      We greatly appreciate your valuable comment and have corrected “Interestingly, we observed that AF2 confidence score shared a strong negative correlation with Cα-RMSDs (Pearson correlation coefficient =-0.67 (Fig. 2b)” into “Interestingly, we observed that AF2 confidence score of CDR-H3 shared a strong negative correlation with Cα-RMSDs (Pearson correlation coefficient =-0.67 (Fig. 2b)” in the revised manuscript.
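
      The Pearson correlation coefficient mentioned in this correction can be computed from paired per-target values as sketched below. The confidence scores and RMSDs are hypothetical and only illustrate the expected negative sign of the relationship; they do not reproduce the reported value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical CDR-H3 confidence scores and Cα-RMSDs (Å) for four targets.
toy_confidence = [0.95, 0.85, 0.70, 0.60]
toy_rmsds = [0.8, 1.5, 1.2, 3.0]
toy_r = pearson_r(toy_confidence, toy_rmsds)
```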

      Line 166: Do authors mean "Taken" instead of "Token"?

      We are really sorry for our careless mistakes. Thank you for your reminder.

      Line 258: Reference to Fig. 1 seems wrong, do authors mean Fig. 4?

      We sincerely thank the reviewer for careful reading. As suggested by the reviewer, we have corrected the “Fig. 1” into “Fig. 4”.

      Author response image 7.

      Point out which plot corresponds to AF2 and which one to H3-OPT

      Thanks for pointing out this issue. We have added the legends of this figure in the proper positions in our manuscript.

    2. eLife assessment

      This paper presents H3-OPT, a deep learning method that effectively combines existing techniques for the prediction of antibody structure. This work is important because the method can aid in the design of antibodies, which are key tools in many research and industrial applications. The experiments for validation are convincing, but some further statistical evaluation would be helpful for the readers.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors developed a deep learning method called H3-OPT, which combines the strength of AF2 and PLM to reach better prediction accuracy of antibody CDR-H3 loops than AF2 and IgFold. These improvements will have an impact on antibody structure prediction and design.

      Strengths:

      The training data are carefully selected and clustered, the network design is simple and effective.

      The improvements include smaller average Ca RMSD, backbone RMSD, side chain RMSD, more accurate surface residues and/or SASA, and more accurate H3 loop-antigen contacts.

      The performance is validated from multiple angles.

      The revised manuscript has cleared my previous concerns.

    4. Reviewer #2 (Public Review):

      This work provides a new tool (H3-Opt) for the prediction of antibody and nanobody structures, based on the combination of AlphaFold2 and a pre-trained protein language model, with a focus on predicting the challenging CDR-H3 loops with higher accuracy than previously developed approaches. This task is of high value for the development of new therapeutic antibodies. The paper provides an external validation consisting of 131 sequences, with further analysis of the results by segregating the test sets in three subsets of varying difficulty and comparison with other available methods. Furthermore, the approach was validated by comparing three experimentally solved 3D structures of anti-VEGF nanobodies with the H3-Opt predictions.

      Strengths:

      The experimental design to train and validate the new approach has been clearly described, including the dataset compilation and its representative sampling into training, validation and test sets, and structure preparation. The results of the in silico validation are quite convincing and support the authors' conclusions.

      The datasets used to train and validate the tool and the code are made available by the authors, which ensures transparency and reproducibility, and allows future benchmarking exercises with incoming new tools.

      Compared to AlphaFold2, the authors' optimization seems to produce better results for the most challenging subsets of the test set.

      Weaknesses:

      The comparison of affinity predictions derived from AlphaFold2 and H3-opt models, based on molecular dynamics simulations, should have been discussed in depth. In some cases, there are huge differences between the estimations from H3-opt models and those from experimental structures. It seems that the authors obtained average differences of the real delta, instead of average differences of the absolute value of the delta. This can be misleading, because high negative differences might be compensated by high positive differences when computing the mean value. Moreover, it would have been good for the authors to disclose the trajectories from the MD simulations.
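
      The concern about averaging signed differences can be shown with a short sketch. The binding free energies below are hypothetical and serve only to show how positive and negative deviations cancel in a signed mean while remaining visible in a mean absolute difference.

```python
from statistics import mean


def mean_signed_and_absolute(predicted, reference):
    """Return (mean signed difference, mean absolute difference) for paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    deltas = [p - r for p, r in zip(predicted, reference)]
    return mean(deltas), mean(abs(d) for d in deltas)


# Hypothetical binding free energies (kcal/mol) from model-based and
# experimental-structure-based simulations of the same three complexes.
toy_model_energies = [-30.0, -50.0, -40.0]
toy_experimental_energies = [-40.0, -40.0, -40.0]
toy_signed, toy_absolute = mean_signed_and_absolute(
    toy_model_energies, toy_experimental_energies
)
```

      Here the signed mean is zero even though two of the three complexes deviate by 10 kcal/mol, which is the compensation effect described above.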

    5. Reviewer #3 (Public Review):

      Summary:

      The manuscript introduces a new computational framework for choosing 'the best method' according to the case for getting the best possible structural prediction for the CDR-H3 loop. The authors show that their strategy improves, on average, the accuracy of the predictions on datasets of increasing difficulty in comparison to several state-of-the-art methods. They also show the benefits of improving the structural predictions of the CDR-H3 in the evaluation of different properties that may be relevant for drug discovery and therapeutics design.

      Strengths:

      The authors introduce a novel framework, which can be easily adapted and improved. The authors use a well-defined dataset to test their new method. A modest average accuracy gain is obtained in comparison to other state-of-the-art methods for the same task, while avoiding the need to test different prediction approaches. Although the accuracy gain is mainly ascribed to easy cases, the accuracy and precision for moderate to challenging cases are comparable to the best PLM methods (see Fig. 4b and Extended Data Fig. 2), reflecting the present methodological limit in the field.

      Weaknesses:

      The proposed method lacks a confidence score or a warning to help guide users in moderate to challenging cases.

    1. Author Response

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript aimed at elucidating the substrate specificity of two M23 endopeptidases, Lysostaphin (LSS) and LytM, in S. aureus. Endopeptidases are known to cleave the glycine-bridges of staphylococcal cell wall peptidoglycan (PG). To address this question, various glycine-bridge peptides were synthesized as substrates, the catalytic domains of LSS and LytM were recombinantly expressed and purified, and the reactions were analyzed using solution-state NMR. The major finding is that LytM is not only a Gly-Gly endopeptidase, but also cleaves D-Ala-Gly. Technically, the advantage of using real-time NMR was emphasized in the manuscript. The study explores an interesting aspect of cell wall hydrolases in terms of substrate-level regulation. It potentially identified new enzymatic activity of LytM. However, the biological significance and relevance of the conclusions remain unclear, as the results are mostly from synthetic substrates.

      Strengths:

      The study explores an interesting aspect of cell wall hydrolases in terms of substrate-level regulation. It potentially identified new enzymatic activity of LytM.

      Weaknesses:

      1) Significance: while the current study provided a detailed analysis of various substrates, the conclusions are mainly based on synthesized peptides. One experiment used purified muropeptides (Fig. 3H); however, the results were unclear from this figure.

      We acknowledge the Reviewer for comments and concerns regarding the potential weaknesses of this study.

      Because peptidoglycan is insoluble, it is not amenable to solution-state NMR studies. However, soluble peptidoglycan (PG) fragments for NMR analyses can be obtained by digesting bacterial sacculi or via chemical synthesis. Whereas digestion results in mixtures of products, synthesis yields pure molecules. Analysis of NMR spectra of muropeptide-mimicking synthetic peptides before and after enzyme addition provides tools to identify peaks in the much more complex spectra of mutanolysin-treated sacculus.

      We will improve data presentation in Figure 3H in the revised version of our manuscript and emphasize the similarity of product peaks in spectra acquired from experiments using either synthetic peptides or mutanolysin-digested sacculus.

      The results from synthesized peptides may not necessarily correlate with their biological functions in vivo.

      The Reviewer refers several times to the use of synthetic peptides in this study. While it is unclear to us whether the concern is about the synthetic nature of the molecules or because the peptides are devoid of PG disaccharide units, it is true that PG fragments lack the 3D architecture present in intact sacculus, and thus cannot perfectly mimic the in vivo milieu. The fragments, as well as purified sacculus, also lack all other components present in an intact bacterial cell wall. Our largest synthetic peptide (7), however, represents a crosslinked muropeptide (stem-pentaGly-stem) which according to the structural model recently presented by Razew et al. (2023) (Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706) is large enough to cover the peptidic interaction interface between substrate and enzyme.

      Secondly, the study used only the catalytic domain of both proteins. It is known that the substrate specificity of these enzymes is regulated by their substrate-binding domains. There is no mention of other domains in the manuscript and no justification of why only the catalytic domain was studied. In short, the relevance of the results from the current study to the enzymes' actual physiological functions remains to be addressed, which attenuated the significance of the study.

      Lysostaphin catalytic domain was used for experimental simplicity and to allow direct comparison with LytM catalytic domain. Because lysostaphin cell-wall targeting (SH3b) domain interacts with the substrate with variable affinities depending on the substrate structure (Tossavainen et al., Structural and functional insights into lysostaphin-substrate interaction, Front. Mol. Biosci. 5, 60 (2018) and Gonzalez-Delgado et al., Two-site recognition of Staphylococcus aureus peptidoglycan by lysostaphin SH3b, Nat. Chem. Biol. 16, 24-30 (2020)), we would have had skewed results on kinetics because of this interaction.

      Catalytic domains were used also in the article by Razew et al. (Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706 (2023)). They showed that mature lysostaphin and lysostaphin catalytic domain hydrolysed the same Gly-Gly bonds.

      Moreover, full-length LytM is catalytically inactive. This is because the linker between its N-terminal and catalytic domains occludes the catalytic site (Odintsov et al. Latent LytM at 1.3 Å resolution. J. Mol. Biol. 225, 775 (2004)). LytM catalytic domain without its N-terminal segment is active (Odintsov et al (2004) and Firczuk et al. Crystal structure of active LytM. J. Mol. Biol 354, 578 (2005)).

      2) Impact and novelty:

      (1) the current study provided evidence suggesting the novel function of LytM in cleaving D-Ala-Gly. The impact of this finding is unclear. The manuscript discussed Enterococcus faecalis EnpA. But how about other M23 endopeptidases? What is biological relevance?

      EnpA was specifically mentioned because it has been reported to also cleave the D-Ala-Gly bond. Structural similarities between the enzymes could reveal the basis for this bond specificity. Moreover, the focus of the study was not to reveal the biological function of LytM but rather to understand which amino acid substitutions lead to differences in specificities in the two structurally very similar enzymes.

      (2) A very similar study published recently showed that the activity of LSS and LytM is regulated by PG cross-linking: LSS cleaves more cross-linked PG and LytM cleaves less cross-linked PG (Razew, A., Laguri, C., Vallet, A., et al. Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706 (2023)). The results of this paper are different from the current study whereby both LSS and LytM prefer cross-linked substrates (Fig. 2JKL). Moreover, no D-Ala-Gly cleavage was observed by LytM using purified PG substrate from Razew A et al. An explanation of inconsistent results is needed here. In my opinion, the knowledge generated from the current study has not been fully settled. If the results can be validated, the contribution to the field is incremental, but not substantial.

      Another point raised by the Reviewer concerned the inconsistent results between our study and the recent paper by Razew et al. (2023) regarding LytM D-Ala--Gly cleavage. The explanation might lie in the type of NMR data acquired and its interpretation. We identified all hydrolysis products using 1H, 13C multiple bond correlation NMR spectra acquired from samples dissolved in deuterated buffers. Use of C-H signals is advantageous in that they are not prone to chemical exchange phenomena and enable unambiguous chemical shift assignment. Based on shown NMR spectra, Razew and co-workers identified cleaved muropeptide bonds by observing product glycine peaks in 1H, 15N correlation spectra, specifically amide peaks of product C-terminal glycines appearing in the 114-117 ppm 15N region of spectra of samples treated with LytM/LSS. D-Ala--Gly cleavage, however produces an N-terminal glycine, whose signal due to chemical exchange is not typically observed in regular N,H correlation spectra. Razew and co-workers validated their observations with UPLC-MS analysis. However, to our understanding, their data analysis was based on the assumption that LytM cleaves between Gly4-Gly5 (or Gly1-Gly2 using our numbering), and accordingly only masses corresponding to potential products containing 1 to 4 glycines anchored to the lysine side chain were considered.

      (3) The authors emphasized a few times in the text that it is superior to use NMR technology. In my opinion, NMR has certain advantages, such as measuring the efficacy of cleavage, but it is not that superior. It should be complementary to other methods such as mass spectrometry. In addition, more relevant solid-state NMR using intact PG or bacterial cells was not discussed in the study. I am of the opinion that the corresponding text should be revised.

      We value and agree with the Reviewer’s opinion that NMR spectroscopy is complementary to other methods, e.g. mass spectrometry. However, in this particular case, NMR simultaneously provided information on reaction kinetics and on the scissile bonds in the substrates, which allowed us to compare rates of hydrolysis in different PG fragments and redefine the substrate specificities of LytM/LSS. We also agree that solid-state NMR is a wonderful technique. In our revised manuscript, we will edit the text accordingly.

      3) The conclusions are not fully supported by the data

      As mentioned above, the conclusions from synthesized peptide substrates may not necessarily reveal physiological functions. The conclusions need to be validated by more physiological substrates.

      As pointed out above in our response to the potential weaknesses of this study, the aim of this work was not to reveal the physiological function of LytM but to glean information on its substrate specificity, which reflects its functional role at the substrate level. Hitherto, LytM has been shown to cleave amide bonds between glycines, but without detailed information about the specific scissile bonds in the established PG components of the S. aureus cell wall. The same holds true for lysostaphin. This study simultaneously provides information on the rates of hydrolysis and the scissile bonds of these two enzymes. We deduced that the substrate specificity of LytM, and especially of lysostaphin, is defined by D-Ala-Gly cross-linking, which is a structural property, whereas Razew et al. (2023) discuss “more cross-linked” and “less cross-linked PG”, which is a supramolecular property or density.

      4) There are some issues with the presentation of the figures, text, and formatting.

      We are grateful to the Reviewer for bringing up issues in figures and text. We will address these in the revised version of the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This work investigates the enzymatic properties of lysostaphin (LSS) and LytM, two enzymes produced by Staphylococcus aureus and previously described as glycyl-glycyl endopeptidases. The authors use synthetic peptide substrates mimicking peptidoglycan fragments to determine the substrate specificity of both enzymes and identify the bonds they cleave.

      Strengths:

      • This work is addressing a real gap in our knowledge since very little information is available about the substrate specificity of peptidoglycan hydrolases.

      • The experimental strategy and its implementation are robust and provide a thorough analysis of LSS and LytM enzymatic activities. The results are very convincing and demonstrate that the enzymatic properties of the model enzymes studied need to be revisited.

      Weaknesses:

      • The manuscript is difficult to read in places and some figures are not always presented in a way that is easy to follow. This being said, the authors have made a good effort to present their experiments in an engaging manner. Some recommendations have been made to improve the current manuscript but these remain minor issues.

      We thank the Reviewer for the positive feedback on our manuscript and for appreciating the systematic work behind this study, which aims to unravel the substrate specificity of two S. aureus PG-hydrolyzing enzymes. We are grateful for the comments aimed at improving the presentation of the current version of the manuscript, and we will take them into account while preparing the revised version.

    2. eLife assessment

      This manuscript describes a valuable study aimed at identifying the substrate specificity of two cell wall hydrolases LSS and LytM in S. aureus. The authors show that LytM has a novel function of cleaving D-Ala-Gly instead of only Gly-Gly by using synthetic substrates and convincing NMR-based real-time kinetics measurements. The biological relevance of the reported results will have to be investigated in future in vivo experiments.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript aimed at elucidating the substrate specificity of two M23 endopeptidases, lysostaphin (LSS) and LytM, in S. aureus. Endopeptidases are known to cleave the glycine bridges of staphylococcal cell wall peptidoglycan (PG). To address this question, various glycine-bridge peptides were synthesized as substrates, the catalytic domains of LSS and LytM were recombinantly expressed and purified, and the reactions were analyzed using solution-state NMR. The major finding is that LytM is not only a Gly-Gly endopeptidase, but also cleaves D-Ala-Gly. Technically, the advantage of using real-time NMR was emphasized in the manuscript. The study explores an interesting aspect of cell wall hydrolases in terms of substrate-level regulation. It potentially identified new enzymatic activity of LytM. However, the biological significance and relevance of the conclusions remain unclear, as the results are mostly from synthetic substrates.

      Strengths:

      The study explores an interesting aspect of cell wall hydrolases in terms of substrate-level regulation. It potentially identified new enzymatic activity of LytM.

      Weaknesses:

      1. Significance: while the current study provided a detailed analysis of various substrates, the conclusions are mainly based on synthesized peptides. One experiment used purified muropeptides (Fig. 3H); however, the results were unclear from this figure. The results from synthesized peptides may not necessarily correlate with their biological functions in vivo. Secondly, the study used only the catalytic domain of both proteins. It is known that the substrate specificity of these enzymes is regulated by their substrate-binding domains. There is no mention of other domains in the manuscript and no justification of why only the catalytic domain was studied. In short, the relevance of the results from the current study to the enzymes' actual physiological functions remains to be addressed, which attenuated the significance of the study.

      2. Impact and novelty: (1) The current study provided evidence suggesting the novel function of LytM in cleaving D-Ala-Gly. The impact of this finding is unclear. The manuscript discussed Enterococcus faecalis EnpA. But how about other M23 endopeptidases? What is the biological relevance? (2) A very similar study published recently showed that the activity of LSS and LytM is regulated by PG cross-linking: LSS cleaves more cross-linked PG and LytM cleaves less cross-linked PG (Razew, A., Laguri, C., Vallet, A., et al. Staphylococcus aureus sacculus mediates activities of M23 hydrolases. Nat Commun 14, 6706 (2023)). The results of this paper are different from the current study whereby both LSS and LytM prefer cross-linked substrates (Fig. 2JKL). Moreover, no D-Ala-Gly cleavage was observed by LytM using purified PG substrate from Razew A et al. An explanation of inconsistent results is needed here. In my opinion, the knowledge generated from the current study has not been fully settled. If the results can be validated, the contribution to the field is incremental, but not substantial. (3) The authors emphasized a few times in the text that it is superior to use NMR technology. In my opinion, NMR has certain advantages, such as measuring the efficacy of cleavage, but it is not that superior. It should be complementary to other methods such as mass spectrometry. In addition, more relevant solid-state NMR using intact PG or bacterial cells was not discussed in the study. I am of the opinion that the corresponding text should be revised.

      3. The conclusions are not fully supported by the data

      As mentioned above, the conclusions from synthesized peptide substrates may not necessarily reveal physiological functions. The conclusions need to be validated by more physiological substrates.

      4. There are some issues with the presentation of the figures, text, and formatting.

    4. Reviewer #2 (Public Review):

      Summary:

      This work investigates the enzymatic properties of lysostaphin (LSS) and LytM, two enzymes produced by Staphylococcus aureus and previously described as glycyl-glycyl endopeptidases. The authors use synthetic peptide substrates mimicking peptidoglycan fragments to determine the substrate specificity of both enzymes and identify the bonds they cleave.

      Strengths:

      • This work is addressing a real gap in our knowledge since very little information is available about the substrate specificity of peptidoglycan hydrolases.

      • The experimental strategy and its implementation are robust and provide a thorough analysis of LSS and LytM enzymatic activities. The results are very convincing and demonstrate that the enzymatic properties of the model enzymes studied need to be revisited.

      Weaknesses:

      • The manuscript is difficult to read in places and some figures are not always presented in a way that is easy to follow. This being said, the authors have made a good effort to present their experiments in an engaging manner. Some recommendations have been made to improve the current manuscript but these remain minor issues.

    1. Author Response

      We would like to thank the senior editor, reviewing editor and all the reviewers for taking the time to review our manuscript and for appreciating our study. We are pleased that all of you found strengths in our work and provided comments to strengthen it further. We sincerely appreciate the valuable comments and suggestions, which we believe will help us to further improve the quality of our work.

      Reviewer 1

      The manuscript by Dubey et al. examines the function of the acetyltransferase Tip60. The authors show that (auto)acetylation of a lysine residue in Tip60 is important for its nuclear localization and liquid-liquid phase separation (LLPS). The main observations are: (i) Tip60 is localized to the nucleus, where it typically forms punctate foci. (ii) An intrinsically disordered region (IDR) within Tip60 is critical for the normal distribution of Tip60. (iii) Within the IDR, the authors show that a lysine residue (K187) that is auto-acetylated is critical. Mutation of that lysine residue to a non-acetylatable arginine abolishes the behavior. (iv) Biochemical experiments show that the formation of the punctate foci may be consistent with LLPS.

      On balance, this is an interesting study that describes the role of acetylation of Tip60 in controlling its biochemical behavior as well as its localization and function in cells. The authors mention in their Discussion section other examples showing that acetylation can change the behavior of proteins with respect to LLPS; depending on the specific context, acetylation can promote (as here for Tip60) or impair LLPS.

      Strengths:

      The experiments are largely convincing and appear to be well executed.

      Weaknesses:

      The main concern I have is that all in vivo (i.e. in cells) experiments are done with overexpression in Cos-1 cells, in the presence of the endogenous protein. No attempt is made to use e.g. cells that would be KO for Tip60 in order to have a cleaner system or to look at the endogenous protein. It would be reassuring to know that what the authors observe with highly overexpressed proteins also takes place with endogenous proteins.

      Response: The main reason for performing these experiments in an overexpression system was to generate different point mutants and deletion mutants of TIP60 and analyse their effects on its properties and functions. To validate our observations from the overexpression system, we also examined the localization pattern of endogenous TIP60 by IFA, and the results show a foci pattern within the nucleus similar to that observed with overexpressed TIP60 protein (Figure 4A). However, we understand the reviewer’s concern and agree to repeat some of the overexpression experiments under endogenous TIP60 knockdown conditions using siRNA or shRNA against the 3’ UTR.

      Also, it is not clear how often the experiments have been repeated and additional quantifications (e.g. of western blots) would be useful.

      Response: The experiments were performed as independent biological replicates (n=3), and this is mentioned in the figure legends. Regarding the suggestion to quantify the Western blots, we would like to point out that for blots requiring quantitative estimation (such as Figures 2F and 6H), graphs representing the quantitated values with p-values have already been added. As suggested, quantitation for Figure 6D will additionally be performed and added in the revised version.

      In addition, regarding the LLPS description (Figure 1), it would be important to show the wetting behaviour and the temperature-dependent reversibility of the droplet formation.

      Response: We appreciate the suggestion, and we will perform these assays and include the results in the revised version.

      In Fig 3C the mutant (K187R) Tip60 is cytoplasmic, but still appears to form foci. Is this still reflecting phase separation, or some form of aggregation?

      Response: The TIP60 (K187R) mutant remains cytosolic with a homogeneous distribution, as shown in Figure 2E. With TIP60 partners such as PXR or p53, this mutant protein also remains homogeneously distributed in the cytosol. However, when co-expressed with TIP60 (Wild-type) protein, the mutant protein, although still cytosolic, also shows some foci-like pattern at the nuclear periphery, which we believe could be accumulated aggregates.

      Reviewer 2

      The manuscript "Autoacetylation-mediated phase separation of TIP60 is critical for its functions" by Dubey S. et al. reported that the acetyltransferase TIP60 undergoes phase separation in vitro and in cell nuclei. The intrinsically disordered region (IDR) of TIP60, particularly K187 within the IDR, is critical for phase separation and nuclear import. The authors showed that K187 is autoacetylated, which is important for TIP60 nuclear localization and activity on histone H4. The authors did several experiments to examine the function of K187R mutants including chromatin binding, oligomerization, phase separation, and nuclear foci formation. However, the physiological relevance of these experiments is not clear since TIP60 K187R mutants do not get into nuclei. The authors also functionally tested the cancer-derived R188P mutant, which mimics K187R in nuclear localization, disruption of wound healing, and DNA damage repair. However, similar to K187R, the R188P mutant is also deficient in nuclear import, and therefore, its defects cannot be directly attributed to the disruption of the phase separation property of TIP60. The main deficiency of the manuscript is the lack of support for the conclusion that "autoacetylation-mediated phase separation of TIP60 is critical for its functions".

      This study offers some intriguing observations. However, the evidence supporting the primary conclusion, specifically regarding the necessity of the intrinsically disordered region (IDR) and K187ac of TIP60 for its phase separation and function in cells, lacks sufficient support and warrants more scrutiny. Additionally, certain aspects of the experimental design are perplexing and lack controls to exclude alternative interpretations. The manuscript can benefit from additional editing and proofreading to improve clarity.

      Response: We understand the point raised by the reviewer; however, we would like to draw the reviewer’s attention to the data in which we demonstrated that acetylation of lysine 187 within the IDR of TIP60 is required for its phase separation (Figure 2J). We would also like to draw the reviewer’s attention to other TIP60 mutants within the IDR (R177H, R188H, K189R), all of which enter the nucleus and form phase-separated foci. The cancer-associated R188P mutation behaves similarly to K187R because it also hampers TIP60 acetylation at the adjacent K187 residue. Our in vitro and in cellulo results clearly demonstrate that autoacetylation of TIP60 at K187 within its IDR is critical for multiple functions, including its translocation into the nucleus, its protein-protein interactions and its oligomerization, which are prerequisites for phase separation of TIP60.

      There are two putative NLS sequences (NLS #1 from aa145; NLS #2 from aa184) in TIP60, both of which are within the IDR. Deletion of the whole IDR is therefore expected to abolish the nuclear localization of TIP60. Since K187 is within NLS #2, the cytoplasmic localization of the IDR and K187R mutants may not be related to the ability of TIP60 to phase separate.

      Response: We are not disputing the presence of putative NLSs within the IDR of TIP60; however, our results with different mutations within the IDR (K76, K80, K148, K150, R177, R178, R188, K189) clearly demonstrate that only acetylation of the K187 residue is critical for shuttling TIP60 into the nucleus, while none of the other mutations located within these putative NLS regions affected TIP60’s nuclear shuttling. We have mentioned in our discussion that autoacetylation of TIP60’s K187 may induce local structural modifications in its IDR that are critical for translocating TIP60 into the nucleus, where it undergoes the phase separation critical for its functions. In a previous example of a similar kind, acetylation of a lysine within the NLS of TyrRS by PCAF promotes its nuclear localization (Cao X et al 2017, PNAS). The IDR (which also contains the K187 site) is important for phase separation once the protein has entered the nucleus. This could be the cell’s mechanism for preventing unwarranted action of TIP60 until it enters the nucleus and phase separates on chromatin at appropriate locations.

      The chromatin-binding activity of TIP60 depends on HAT activity, but not phase-separation (Fig 1I), (Fig 2B). How do the authors reconcile the fact that the K187R mutant is able to bind to chromatin with lower activity than the HAT mutant (Fig 2F, 2I)?

      Response: K187 acetylation is required for TIP60’s nuclear translocation but is not critical for chromatin binding. When the soluble fraction is prepared in the fractionation experiment, the nuclear membrane is disrupted, so the TIP60 (K187R) mutant is no longer hindered from accessing chromatin and can load onto it (although not as efficiently as the wild-type protein). Efficient chromatin binding requires autoacetylation of other lysine residues in TIP60, which might be hampered by reduced catalytic activity or be insufficient to maintain equilibrium with HDAC activity inside the nucleus. In the case of K187R, the reduced autoacetylation is captured while the protein is in the cytosol. During fractionation, once this mutant has access to chromatin, it might autoacetylate other lysine residues critical for chromatin loading (note that the catalytic domain is intact in this mutant). This is evident from the hyper-autoacetylation of the wild-type protein compared with the K187R or HAT mutant proteins. We would like to point out that phase separation occurs only after efficient chromatin loading of TIP60, which is why, under in cellulo conditions, both the K187R mutant (which cannot enter the nucleus) and the HAT mutant (which enters the nucleus but fails to bind chromatin efficiently) fail to form phase-separated nuclear punctate foci.

      The DIC images of phase separation in Fig 2I need to be improved. The image for K187R showed the irregular shape of the condensates, which suggests particles in solution or on the slide. The authors may need to use fluorescent-tagged TIP60 in the in vitro LLPS experiments.

      Response: We believe this comment refers to Figure 2J. The irregularly shaped condensates observed for TIP60 K187R are unique to the mutant protein and are not caused by particles on the slide. We would like to draw the reviewer’s attention to supplementary Figure S2A, where the DIC images for TIP60 (Wild-type) protein tested under different protein and PEG8000 conditions are completely clear wherever the protein did not form phase-separated droplets, ruling out the possibility of particles in solution or on the slides.

      The authors mentioned that the HAT mutant of TIP60 does not phase separate, which needs to be included.

      Response: We have already added the image of RFP-TIP60 (HAT mutant) in supplementary Fig S4A (panel 2) in the manuscript.

      Related to Point 3, the HAT mutant that doesn't form punctate foci by itself, can incorporate into WT TIP60 (Fig 5A). In vitro LLPS assay for WT, HAT, and K187R mutants with or without acetylation should be included. WT and mutant TIP can be labelled with GFP and RFP, respectively.

      Response: We would like to draw the reviewer’s attention to the co-expression experiments in Figure 5, where the wild-type protein (in both tagged and untagged conditions) is able to phase separate and form punctate foci together with co-expressed HAT mutant protein (which has depleted autoacetylation capacity). We believe these in cellulo experiments already answer the questions that the reviewer suggests addressing with in vitro experiments.

      Fig 3A and 3B showed that neither K187 mutant nor HAT mutant could oligomerize. If both experiments were conducted in the absence of in vitro acetylation, how do the authors reconcile these results?

      Response: We thank the reviewer for highlighting our oversight in omitting the mention of acetyl coenzyme A here. To induce acetylation under in vitro conditions, we added 10 µM acetyl-CoA to the reactions depicted in Figures 3A and 3B. The information on acetyl-CoA for Figure 3B was already included in the description of the GST pull-down assay (Materials and Methods section). We will add the same to the description of the oligomerization assay in the Materials and Methods of the revised manuscript.

      In Fig 4, the colocalization images showed little overlap between TIP60 and nuclear speckle (NS) marker SC35, indicating that the majority of TIP60 localized in the nuclear structure other than NS. Have the authors tried to perturb the NS by depleting the NS scaffold protein and examining TIP60 foci formation? Do PXR and TP53 localize to NS?

      Response: Under normal conditions, the majority of TIP60 is not localized in nuclear speckles (NS), so we believe that perturbing NS will not have a significant effect on TIP60 foci formation. Interestingly, a recent study by Shelley Berger’s group (Alexander KA et al Mol Cell. 2021;81(8):1666-1681) showed that p53 localizes to NS to regulate a subset of its target genes. We have mentioned this in our discussion section. No information is available about the localization of PXR in NS.

      Were TIP60 substrates, H4 (or NCP), PXR, TP53, present in TIP60 condensates in vitro? It's interesting to see both PXR and TP53 had homogenous nuclear signals when expressed together with K187R, R188P (Fig 6E, 6G), or HAT (Suppl Fig S4A) mutants. Are PXR or TP53 nuclear foci dependent on their acetylation by TIP60? This can and should be tested.

      Response: Both p53 and PXR are known to be acetylated by TIP60. In the case of PXR, TIP60 acetylates PXR at lysine 170, and this TIP60-mediated acetylation is important for the formation of TIP60-PXR foci, which we now know are formed by phase separation (Bakshi K et al Sci Rep. 2017 Jun 16;7(1):3635).

      Since R188P mutant, like K187R, does not get into the nuclei, it is not suitable to use this mutant to examine the functional relevance of phase separation for TIP60. The authors need to find another mutant in IDR that retains nuclear localization and overall HAT activity but specifically disrupts phase separation. Otherwise, the conclusion needs to be restated. All cancer-derived mutants need to be tested for LLPS in vitro.

      Response: We appreciate the reviewer’s point here, but it is important to note that the objective of these experiments is to understand the impact of K187R (critical for multiple aspects of TIP60, including phase separation) and R188P (a naturally occurring cancer-associated mutation that behaves similarly to K187R) on TIP60’s activities in order to determine their functional relevance. As for the reviewer’s suggestion, identifying an IDR mutant that fails to phase separate but retains nuclear localization and catalytic activity can be pursued in future studies.

      For all cellular experiments, it is not mentioned whether endogenous TIP60 was removed and absent in the cell lines used in this study. It's important to clarify this point because the localization and function of mutant TIP60 are affected by WT TIP60 (Fig 5).

      Response: Endogenous TIP60 was present in the in cellulo experiments; however, as suggested by Reviewer 1, we will perform some of the in cellulo experiments under endogenous TIP60 knockdown conditions to validate our findings.

      It is troubling that H4 peptide is used for in vitro HAT assay since TIP60 has much higher activity on nucleosomes and its preferred substrates include H2A.

      Response: The purpose of using the H4 peptide in the HAT assay is to determine the impact of the mutations on TIP60’s catalytic activity. As H4 is one of the major histone substrates of TIP60, we believe it satisfies the objective of the experiments.

      Reviewer 3

      This study presents results arguing that the mammalian acetyltransferase Tip60/KAT5 auto-acetylates itself on one specific lysine residue before the MYST domain, which in turn favors not only nuclear localization but also condensate formation on chromatin through LLPS. The authors further argue that this modification is responsible for the bulk of Tip60 autoacetylation and acetyltransferase activity towards histone H4. Finally, they suggest that it is required for association with transcription factors and in vivo function in gene regulation and DNA damage response.

      These are very wide and important claims and, while some results are interesting and intriguing, there is not really close to enough work performed/data presented to support them. In addition, some results are redundant between them, lack consistency in the mutants analyzed, and show contradiction between them. The most important shortcoming of the study is the fact that every single experiment in cells was done in over-expressed conditions, from transiently transfected cells. It is well known that these conditions can lead to non-specific mass effects, cellular localization not reflecting native conditions, and disruption of native interactome. On that topic, it is quite striking that the authors completely ignore the fact that Tip60 is exclusively found as part of a stable large multi-subunit complex in vivo, with more than 15 different proteins. Thus, arguing for a single residue acetylation regulating condensate formation and most Tip60 functions while ignoring native conditions (and the fact that Tip60 cannot function outside its native complex) does not allow me to support this study.

      Response: We appreciate the reviewer’s point here, but it is important to note that the main purpose of using an overexpression system in this study is to analyse the effects of the different point and deletion mutations generated in TIP60. We have overexpressed proteins with different tags (GFP or RFP) or without tags (Figure 3C, Figure 5) to confirm that the behaviour of the protein is not perturbed by the presence of tags. As validation, we have also examined the localization of endogenous TIP60 protein, which shows localization behaviour similar to that of the overexpressed protein. We would like to point out that there are several reports in the literature in which similar overexpression systems were used to determine the functions of TIP60 and its mutants. The nuclear foci pattern observed for TIP60 in our studies has also been reported by several other groups.

      Sun, Y., et. al. (2005) A role for the Tip60 histone acetyltransferase in the acetylation and activation of ATM. Proc Natl Acad Sci U S A, 102(37):13182-7.

      Kim, C.-H. et al. (2015) ‘The chromodomain-containing histone acetyltransferase TIP60 acts as a code reader, recognizing the epigenetic codes for initiating transcription’, Bioscience, Biotechnology, and Biochemistry, 79(4), pp. 532–538.

      Wee, C. L. et al. (2014) ‘Nuclear Arc Interacts with the Histone Acetyltransferase Tip60 to Modify H4K12 Acetylation’, eNeuro, 1(1). doi: 10.1523/ENEURO.0019-14.2014.

      However, as a precaution, and as also suggested by the other reviewers, we will perform some of these overexpression experiments in the absence of endogenous TIP60 by using 3’ UTR-specific siRNA/shRNA.

      We thank the reviewer for the comment on the multi-subunit complex, and we would like to expand our study by determining the interaction of some of the complex subunits with TIP60 (Wild-type), which forms nuclear condensates; TIP60 (HAT mutant), which enters the nucleus but does not form condensates; and TIP60 (K187R), which neither enters the nucleus nor forms condensates. We will include the results of these experiments in the revised manuscript.

      • It is known that over-expression after transient transfection can lead to non-specific acetylation of lysines on the proteins, likely in part to protect from proteasome-mediated degradation. It is not clear whether the Kac sites targeted in the experiments are based on published/public data. In that sense, it is surprising that the K327R mutant does not behave like a HAT-dead mutant (which is what exactly?) or the K187R mutant as this site needs to be auto-acetylated to free the catalytic pocket, so essential for acetyltransferase activity like in all MYST-family HATs. In addition, the effect of K187R on the total acetyl-lysine signal of Tip60 is very surprising as this site does not seem to be a dominant one in public databases.

      Response: We have chosen autoacetylation sites based on previously published studies where LC-MS/MS and in vitro acetylation assays were used to identified autoacetylation sites in TIP60 which includes K187. We have already mentioned about it in the manuscript and have quoted the references (1. Yang, C., et al (2012). Function of the active site lysine autoacetylation in Tip60 catalysis. PloS one 7, e32886. 10.1371/journal.pone.0032886. 2. Yi, J., et al (2014). Regulation of histone acetyltransferase TIP60 function by histone deacetylase 3. The Journal of biological chemistry 289, 33878–33886. 10.1074/jbc.M114.575266.). We would like to emphasize that both these studies have identified K187 as autoacetylation site in TIP60. Since TIP60 HAT mutant (with significantly reduced catalytic activity) can also enter nucleus, it is not surprising that K327 could also enter the nucleus.

      • As the physiological relevance of the results is not clear, the mutants need to be analyzed at the native level of expression to study real functional effects on transcription and localization (ChIP/IF). It is not clear what the claim that Tip60 forms nuclear foci/punctate signals at physiological levels is based on. This is certainly debated, in part because of the poor choice of antibodies available for IF analysis. In that sense, it is not clear which Ab is used in the Westerns. Endogenous Tip60 is known to be expressed in multiple isoforms from splice variants, the most dominant one being isoform 2 (PLIP) which lacks a big part (aa96-147) of the so-called IDR domain presented in the study. Does this major isoform behave the same?

      Response: The TIP60 antibody used in the study is from Santa Cruz (Cat. No. sc-166323). This antibody is widely used for TIP60 detection by several methods and has been cited in numerous publications. The catalogue number will be added to the manuscript. Regarding isoforms, three isoforms are known for TIP60, among which isoform 2 is the most highly expressed and is the one used in our study. Isoforms 1 and 2 have the same IDR length (150 amino acids), while isoform 3 has an IDR of 97 amino acids. Interestingly, K187 is present in all the isoforms (already mentioned in the manuscript), and the region missing from isoform 3 (amino acids 96-147) has a lower propensity for disorder (marked with a blue circle). This shows that all the isoforms of TIP60 have the tendency to phase separate.

      Author response image 1.

      • It is extremely strange to show that the K187R mutant fails to get in the nuclei by cell imaging but remains chromatin-bound by fractionation... If K187 is auto-acetylated and required to enter the nucleus, why would a HAT-dead mutant not behave the same?

      Response: We would like to point out that neither the HAT mutant nor the K187R mutant is completely catalytically dead. As our data show, both mutants retain catalytic activity, although at significantly decreased levels. We believe that K187 acetylation is critical for TIP60 to enter the nucleus and that, once TIP60 has shuttled into the nucleus, autoacetylation of other sites is required for efficient chromatin binding. In the fractionation assay, the nuclear membrane is dissolved while preparing the soluble fraction, so nothing hinders the K187R mutant from accessing the chromatin. The HAT mutant, in contrast, can acetylate the K187 site and is thus able to enter the nucleus; however, its residual catalytic activity is either unable to autoacetylate the other residues required for efficient chromatin binding or unable to counter the activity of HDACs deacetylating TIP60.

      • If K187 acetylation is key to Tip60 function, it would be most logical (and classical) to test a K187Q acetyl-mimic substitution. In that sense, what happens with the R188Q mutant? That all goes back to the fact that this cluster of basic residues looks quite like an NLS.

      Response: As suggested, we will generate an acetylation-mimicking mutant of the K187 site and examine it. The results will be added to the revised manuscript.

      • The effect of the mutant on the TIP60 complex itself needs to be analyzed, e.g. for associated subunits like p400, ING3, TRRAP, Brd8...

      Response: As suggested, we will examine the effect of the mutations on the TIP60 complex.

    2. eLife assessment

      This is a valuable study on K187 acetylation of the nuclear protein, TIP60, required for its phase separation and function. The evidence supporting the primary conclusion is incomplete and warrants more scrutiny.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Dubey et al. examines the function of the acetyltransferase Tip60. The authors show that (auto)acetylation of a lysine residue in Tip60 is important for its nuclear localization and liquid-liquid-phase-separation (LLPS).

      The main observations are: (i) Tip60 is localized to the nucleus, where it typically forms punctate foci. (ii) An intrinsically disordered region (IDR) within Tip60 is critical for the normal distribution of Tip60. (iii) Within the IDR the authors show that a lysine residue (K187), that is auto-acetylated, is critical. Mutation of that lysine residue to a non-acetylable arginine abolishes the behavior. (iv) biochemical experiments show that the formation of the punctate foci may be consistent with LLPS.

      Strengths:

      The experiments are largely convincing and appear to be well executed.

      Weaknesses:

      The main concern I have is that all in vivo (i.e. in cells) experiments are done with overexpression in Cos-1 cells, in the presence of the endogenous protein. No attempt is made to use e.g. cells that would be KO for Tip60 in order to have a cleaner system or to look at the endogenous protein. It would be reassuring to know that what the authors observe with highly overexpressed proteins also takes place with endogenous proteins.

      Also, it is not clear how often the experiments have been repeated and additional quantifications (e.g. of western blots) would be useful.

      In addition, regarding the LLPS description (Figure 1), it would be important to show the wetting behavior and the temperature-dependent reversibility of the droplet formation.

      On balance, this is an interesting study that describes the role of acetylation of Tip60 in controlling its biochemical behavior as well as its localization and function in cells. The authors mention in their Discussion section other examples showing that acetylation can change the behavior of proteins with respect to LLPS; depending on the specific context, acetylation can promote (as here for Tip60) or impair LLPS.

    4. Reviewer #2 (Public Review):

      The manuscript "Autoacetylation-mediated phase separation of TIP60 is critical for its functions" by Dubey S. et al reported that the acetyltransferase TIP60 undergoes phase separation in vitro and cell nuclei. The intrinsically disordered region (IDR) of TIP60, particularly K187 within the IDR, is critical for phase separation and nuclear import. The authors showed that K187 is autoacetylated, which is important for TIP60 nuclear localization and activity on histone H4. The authors did several experiments to examine the function of K187R mutants including chromatin binding, oligomerization, phase separation, and nuclear foci formation. However, the physiological relevance of these experiments is not clear since TIP60 K187R mutants do not get into nuclei. The authors also functionally tested the cancer-derived R188P mutant, which mimics K187R in nuclear localization, disruption of wound healing, and DNA damage repair. However, similar to K187R, the R188P mutant is also deficient in nuclear import, and therefore, its defects cannot be directly attributed to the disruption of the phase separation property of TIP60. The main deficiency of the manuscript is the lack of support for the conclusion that "autoacetylation-mediated phase separation of TIP60 is critical for its functions".

      This study offers some intriguing observations. However, the evidence supporting the primary conclusion, specifically regarding the necessity of the intrinsically disordered region (IDR) and K187ac of TIP60 for its phase separation and function in cells, lacks sufficient support and warrants more scrutiny. Additionally, certain aspects of the experimental design are perplexing and lack controls to exclude alternative interpretations. The manuscript can benefit from additional editing and proofreading to improve clarity.

    5. Reviewer #3 (Public Review):

      This study presents results arguing that the mammalian acetyltransferase Tip60/KAT5 auto-acetylates itself on one specific lysine residue before the MYST domain, which in turn favors not only nuclear localization but also condensate formation on chromatin through LLPS. The authors further argue that this modification is responsible for the bulk of Tip60 autoacetylation and acetyltransferase activity towards histone H4. Finally, they suggest that it is required for association with transcription factors and in vivo function in gene regulation and DNA damage response.

      These are very wide and important claims and, while some results are interesting and intriguing, there is not really close to enough work performed/data presented to support them. In addition, some results are redundant between them, lack consistency in the mutants analyzed, and show contradiction between them. The most important shortcoming of the study is the fact that every single experiment in cells was done in over-expressed conditions, from transiently transfected cells. It is well known that these conditions can lead to non-specific mass effects, cellular localization not reflecting native conditions, and disruption of native interactome. On that topic, it is quite striking that the authors completely ignore the fact that Tip60 is exclusively found as part of a stable large multi-subunit complex in vivo, with more than 15 different proteins. Thus, arguing for a single residue acetylation regulating condensate formation and most Tip60 functions while ignoring native conditions (and the fact that Tip60 cannot function outside its native complex) does not allow me to support this study.

      Specific points:

      -It is known that over-expression after transient transfection can lead to non-specific acetylation of lysines on the proteins, likely in part to protect from proteasome-mediated degradation. It is not clear whether the Kac sites targeted in the experiments are based on published/public data. In that sense, it is surprising that the K327R mutant does not behave like a HAT-dead mutant (which is what exactly?) or the K187R mutant as this site needs to be auto-acetylated to free the catalytic pocket, so essential for acetyltransferase activity like in all MYST-family HATs. In addition, the effect of K187R on the total acetyl-lysine signal of Tip60 is very surprising as this site does not seem to be a dominant one in public databases.

      -As the physiological relevance of the results is not clear, the mutants need to be analyzed at the native level of expression to study real functional effects on transcription and localization (ChIP/IF). It is not clear what the claim that Tip60 forms nuclear foci/punctate signals at physiological levels is based on. This is certainly debated, in part because of the poor choice of antibodies available for IF analysis. In that sense, it is not clear which Ab is used in the Westerns. Endogenous Tip60 is known to be expressed in multiple isoforms from splice variants, the most dominant one being isoform 2 (PLIP) which lacks a big part (aa96-147) of the so-called IDR domain presented in the study. Does this major isoform behave the same?

      -It is extremely strange to show that the K187R mutant fails to get in the nuclei by cell imaging but remains chromatin-bound by fractionation... If K187 is auto-acetylated and required to enter the nucleus, why would a HAT-dead mutant not behave the same?

      -If K187 acetylation is key to Tip60 function, it would be most logical (and classical) to test a K187Q acetyl-mimic substitution. In that sense, what happens with the R188Q mutant? That all goes back to the fact that this cluster of basic residues looks quite like an NLS.

      -The effect of the mutant on the TIP60 complex itself needs to be analyzed, e.g. for associated subunits like p400, ING3, TRRAP, Brd8...

      -The discussion is excessively long without addressing the obvious questions mentioned above.

    1. Author Response

      Reviewer #1 (Public Review):

      “A sample size of 3 idiopathic seems underpowered relative to the many types of genetic changes that can occur in ASD. Since the authors carried out WGS, it would be useful to know what potential causative variants were found in these 3 individuals and even if not overlapping if they might expect to be in a similar biological pathway.

      If the authors randomly selected 3 more idiopathic cell lines from individuals with autism, would these cell lines also have altered mTOR signaling? And could a line have the same cell biology defects without a change in mTOR signaling? The authors argue that the sample size could be the reason for lack of overlap of the proteomic changes (unlike the phospho-proteomic overlaps), which makes the overlapping cell biology findings even more remarkable. Or is the phenotyping simply too crude to know if the phenotypes truly are the same?”

      We appreciate these thoughtful comments and agree that, of several models, our studies indicate the possibility of mTOR alteration in multiple forms of ASD. As above, we are currently pursuing this hypothesis with newly acquired DOD support. With regard to the I-ASD population, we agree that a large variety of genetic changes can occur in genetically undefined ASDs. Indeed, this is precisely why we expected to see “personalized” phenotypes in each I-ASD individual when we embarked on this study. At that time, several years ago, we had planned to expand the analyses to more I-ASD individuals to assess for additional personalized phenotypes. However, as our studies progressed, we were surprised to find convergence in our I-ASD population in terms of neurite outgrowth and migration, and later proteomic results showed convergence on mTOR. We found it particularly remarkable that this convergence was noted despite a sample size of 3. When we had the opportunity to extend our studies to the 16p11.2 deletion population, we were thrilled to conduct the first comparison between I-ASD and a genetically defined ASD and, as such, the scope of the paper turned towards this comparison. We do agree that analyses of the other I-ASD individuals would be a beneficial endeavor, both to understand how pervasive NPC migration and neurite deficits are in autism and to assess the presence of mTOR dysregulation. Furthermore, it would be important to see whether alterations in other pathways could also lead to similar cell biological deficits, though we know that other studies of neurodevelopmental disorders have found such cellular dysregulations without reporting concurrent mTOR dysregulation. Because these analyses are being extended under our current grant funding, such experiments are not feasible within this manuscript.

      Regarding the phenotyping methods used, we decided to assess neurite outgrowth and migration as they are both cytoskeleton dependent processes that are critical for neurodevelopment and are often regulated by the same genes. Furthermore, similar analyses have been applied to Fragile-X Syndrome, 22q11.2 deletion syndrome, and schizophrenia NPCs (Shcheglovitov A. et al., 2013; Mor-Shaked H. et al., 2016; Urbach A. et al., 2010; Kelley D. J. et al., 2008; Doers M. E. et al., 2014; Brennand K. et al., 2015; Lee I. S. et al., 2015; Marchetto M. C. et al., 2011). As such, it seems that multiple underlying etiologies can lead to similar dysregulated cellular phenotypes that can contribute to a variety of neurodevelopmental disorders. On a more global level, there are only a few different cellular functions a developing neuron can undergo, and these include processes such as proliferation, survival, migration, and differentiation. Thus, to understand neurodevelopmental disorders, it is important to study the more “crude” or “global” cellular functions occurring during neurodevelopment to determine whether they are disrupted in disorders such as ASD. In our studies we find that there are indeed dysregulations in many of these basic developmental processes, indicating that the typical steps that occur for normal brain cytoarchitecture may be disrupted in ASD. To understand why, we then further utilized molecular studies to “zoom” in on potential mechanisms which implicated common dysregulation in mTOR signaling as one driver for these common cellular phenotypes. As suggested, we did complete WGS on all the I-ASD individuals and did not see any overlapping genetic variants between the three I-ASD individuals as mentioned in our manuscript. The genetic data was published in a larger manuscript incorporating the data (Zhou A. et al., 2023). 
However, there were variants that were unique to each I-ASD individual and were not seen in their unaffected family members, and it is possible that these variants contribute to the I-ASD phenotypes. We also utilized IPA to conduct pathway analysis on the WGS data, using the same approach as in our analysis of the p-proteome and proteome data. From the WGS data, we selected high read-quality variants that were found only in I-ASD individuals and had a functional impact on the protein (i.e., excluding synonymous variants). The enriched pathways obtained from these data were strikingly different from the pathways we found in the p-proteome analysis and are now included in supplemental Figure 6 in the manuscript. Briefly, the top 5 enriched pathways were: O-linked glycosylation, MHC class 1 signaling, Interleukin signaling, Antigen presentation, and regulation of transcription.
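
      The variant selection described above can be sketched as a simple filter. This is only an illustration under stated assumptions: the field names (`quality`, `in_unaffected_relative`, `consequence`), the function name and the quality threshold of 30 are hypothetical and are not taken from the manuscript, which does not specify its cut-offs or tooling.

```python
def select_candidate_variants(variants, min_quality=30.0):
    """Keep variants that pass all three criteria named in the response.

    Each variant is a dict with hypothetical keys:
      quality                -- read-quality score (threshold is an assumption)
      in_unaffected_relative -- True if the variant is also seen in an unaffected family member
      consequence            -- predicted effect on the protein, e.g. "missense" or "synonymous"
    """
    selected = []
    for variant in variants:
        if variant["quality"] < min_quality:
            continue  # drop low read-quality calls
        if variant["in_unaffected_relative"]:
            continue  # drop variants shared with unaffected family members
        if variant["consequence"] == "synonymous":
            continue  # drop variants with no change to the protein sequence
        selected.append(variant)
    return selected
```

      The surviving variants would then be passed to a pathway-enrichment tool; that step is not shown because it depends on the specific software used.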

      Reviewer #2 (Public Review):

      1) I found interpreting how differential EF sensitivity is connected to the rest of the story difficult at times. First, it is unclear why these extracellular factors were picked. These are seemingly different in nature (a neuropeptide, a growth factor and a neuromodulator) targeting largely different pathways. This limits the interpretation of the ASD subtype-specific rescue results. One way of reframing that could help is that these are pro-migratory factors instead of EFs broadly defined that fail to promote migration in I-ASD lines due to a shared malfunctioning of the intracellular migration machinery or cell-cell interactions (possibly through tight junction signaling, Fig S2A). Yet, this doesn't explain the migration/neurite phenotypes in 16p11 lines where EF sensitivity is not altered, overall implying that divergent EF sensitivity is independent of underlying mTOR state. What is the proposed model that connects all three findings (divergent EF sensitivity based on ASD subtypes, 2 mTOR classes, convergent cellular phenotypes)?

      We thank you for the kind assessment of our manuscript and for the thought-provoking questions posed. For our study, we defined an extracellular factor as any growth factor, amino acid, neurotransmitter, or neuropeptide found in the extracellular environment of the developing cells. The EFs utilized were selected due to their well-established role in regulation of early neurodevelopmental phenotypes, their expression during the “critical window” of mid-fetal development (as determined by the Allen Brain Atlas), and in the case of 5-HT, its association with ASD (Abdulamir H. A. et al., 2018; Adamsen D. et al., 2014; Bonnin A. et al., 2011; Bonnin A. et al., 2007; Chen X. et al., 2015; El Marroun H. et al., 2014; Hammock E. et al., 2012; Yang C. J. et al., 2014; Dicicco-Bloom E. et al., 1998; Lu N. et al., 1998; Suh J. et al., 2001; Watanabe J. et al., 2016; Gilmore J. H. et al., 2003; Maisonpierre P. C. et al., 1990; Dincel N. et al., 2013; Levi-Montalcini R., 1987). Lastly, prior experiments in our lab with a mouse model of neurodevelopmental disorders had shown atypical responses to EFs (IGF-1, FGF, PACAP). As such, when we first chose to use EFs in human NPCs we wanted to know 1) whether human NPCs even responded to these EFs, 2) whether EFs regulated neurite outgrowth and migration and 3) whether there would be a differential response in NPCs derived from those with ASD. Our studies were initiated on the I-ASD cohort and, given the heterogeneity of ASD, we had hypothesized we would get “personalized” neurite and migration phenotypes. For this reason, we also wanted to select multiple types of EFs that worked on different signaling pathways. Ultimately, instead of personalized phenotypes, we found that none of the I-ASD NPCs responded to any of the EFs tested whereas the 16p11.2 deletion NPCs did; this was therefore the only difference we found between these two “forms” of ASD. 
As noted, in I-ASD the lack of response to EFs can be ameliorated by modulating mTOR. However, in the 16p11.2 deletion, despite mTOR dysregulation similar to that seen in I-ASD, there is no EF impairment. We do not have a cohesive model to explain why the 16pDel individuals differ from the I-ASD individuals, other than to point to the p-proteomes, which do show that the 16pDel NPCs are distinct from the I-ASD NPCs. It seems that mTOR alteration can contribute to impaired EF responsiveness in some NPCs, but perhaps an additional defect needs to be present for this impairment to manifest, or 16p11.2 deletion NPCs have specific compensatory features. For example, as noted in the thoughtful comment, the p-proteome canonical pathway analysis shows tight junction malfunction in I-ASD which is not present in the 16pDel NPCs, and it could be the combination of mTOR dysregulation and dysregulated tight junction signaling that has led to the lack of response to EFs in I-ASD. Regardless, we do not think the differences between two genetically distinct ASDs diminish the convergent mTOR results we have uncovered. That is, whatever defects are present in the ASD NPCs, we are able to rescue them with mTOR modulation, which has fascinating implications for the treatment and conceptualization of ASD. Lastly, we see our EF studies as an important inclusion because they show that in some subtypes of ASD, lack of response to appropriate EFs could be contributing to neurodevelopmental abnormalities. Moreover, lack of response to these EFs could have implications for treatment of individuals with ASD (for example, SSRIs are commonly used to treat co-morbid conditions in ASD, but if an individual is unresponsive to 5-HT, perhaps this treatment is less effective). We have edited the manuscript to include an additional discussion section that addresses the EFs more thoroughly and have added a few sentences to the introduction as well.

      2) A similar bidirectional migration phenotype has been described in hiPSC-derived human cortical interneurons generated from individuals with Timothy Syndrome (Birey et al 2022, Cell Stem Cell). Here, authors show that the intracellular calcium influx that is excessive in Timothy Syndrome or pharmacologically dampened in controls results in similar migration phenotypes. Authors can consider referring to this report in support of the idea that bimodal perturbations of cardinal signaling pathways can converge upon common cellular migration deficits.

      We thank you for pointing out the similar migration phenotype in the Timothy Syndrome paper and have now cited it in our manuscript. We have also expanded on the concept of “too much or too little” of a particular signaling mechanism leading to common outcomes.

      3) Given that authors have access to 8 I-ASD hiPSC lines, it would be very informative to assay the mTOR state (e.g. pS6 westerns) in NPCs derived from all 8 lines instead of the 3 presented, even without assessing any additional cellular phenotypes, which authors have shown to be robust and consistent. This can help the readers get a better sense of the proportion of high-mTOR vs low-mTOR classes in a larger cohort.

      We have already addressed this in response to reviewer 1 and the essential revisions section, providing our reasoning for not expanding the study to all 8 I-ASD individuals.

      4) Does the mTOR modulation rescue EF-specific responses to migration as well (Figure 7)?

      We did not conduct sufficient replicates of the EF-specific migration rescue experiments, owing to the time-consuming and resource-intensive nature of the neurosphere experiments. Unlike the neurite experiments, the neurosphere experiments require significantly more cells, more time, selection of neurospheres based on a size criterion, and then manual trace measurements. We did one experiment in Family-1 in which we utilized MK-2206 to abolish the response of Sib NPCs to PACAP. Likewise, adding SC-79 to I-ASD-1 neurospheres allowed for a response to PACAP.

      Author response image 1.

      Author response image 2.

      Reviewer #3 (Public Review):

      We appreciate the kind, detailed and very thorough review you provided for us!

      The results on the mTOR signaling pathway as a point of convergence in these particular ASD subtypes is interesting, but the discussion should address that this has been demonstrated for other autism syndromes, and in the present manuscript, there should be some recognition that other signaling pathways are also implicated as common factors between the ASD subtypes.

      With regard to the mTOR pathway, we had included the other ASD syndromes in which mTOR dysregulation has been seen, including tuberous sclerosis, Cowden Syndrome, NF-1, as well as Fragile-X, Angelman, Rett and Phelan McDermid, in the final paragraph of the discussion section “mTOR Signaling as a Point of Convergence in ASD”. We have now expanded our discussion to note that other signaling pathways, such as MAPK, cyclins, WNT, and reelin, have also been implicated as common factors between the ASD subtypes.

      The conclusions of this paper are mostly well supported by data, but for the cell migration assay, it is not clear if the authors control for initial differences in the inner cell mass area of the neurospheres in control vs ASD samples, which would affect the measurement of migration.

      Thank you for this thoughtful comment! When we first began collecting migration data, inner cell mass size was indeed a major concern, and we controlled for it in our methods. First, when plating the neurospheres, we would only collect spheres when a majority of spheres were approximately 100 µm in diameter. Very large spheres often could not be imaged because they were out of focus, and very small spheres would often disperse when plated. Thus, there were some constraints on the variability of inner cell mass size.

      Furthermore, when we initially collected data, we conducted a proof-of-principle test to see if initial inner cell mass area (henceforth referred to as initial sphere size or ISS) influenced migration data. To do so, we obtained migration and ISS data from each diagnosis (Sib, NIH, I-ASD, 16pASD). Then we utilized RStudio to test for a relationship between Migration and ISS in each diagnosis category using the call lm(Migration ~ ISS, data = bydiagnosis). In this call, lm indicates linear modeling, the tilde (~) specifies that Migration is modeled as a function of ISS, and the argument data = bydiagnosis supplies the data organized by diagnosis.

      The results were expressed as R-squared values, indicating the strength of the relationship between ISS and Migration for each diagnosis, and p-values, indicating the statistical significance of each relationship. As shown in Author response table 1, there is minimal correlation between Migration and ISS in each dataset. Moreover, none of the relationships between Migration and ISS is statistically significant, indicating that initial sphere size does not influence migration in any of our datasets.

      Author response table 1.
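
      For readers who want to reproduce this kind of check, the per-diagnosis regression of Migration on ISS can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function names and the record layout are assumptions, and it reports only the slope, intercept and R-squared (the p-value that R's lm summary also provides is omitted to keep the sketch dependency-free).

```python
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Assumes x has at least two
    distinct values.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R-squared is the squared Pearson correlation.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def fit_by_diagnosis(records):
    """Fit Migration ~ ISS separately within each diagnosis group.

    records: iterable of (diagnosis, iss, migration) tuples (hypothetical layout).
    Returns {diagnosis: (slope, intercept, r_squared)}.
    """
    groups = defaultdict(lambda: ([], []))
    for diagnosis, iss, migration in records:
        groups[diagnosis][0].append(iss)
        groups[diagnosis][1].append(migration)
    return {d: fit_line(xs, ys) for d, (xs, ys) in groups.items()}
```

      An R-squared near zero within a group means that initial sphere size explains almost none of the variance in migration for that group, which is the pattern the response describes.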

      Lastly, utilizing R, we modeled what predicted migration would be like for Sib, NIH, I-ASD, and 16pASD if we accounted for ISS in each group. Raw migration data was then plotted against the predicted data as in Author response image 3.

      Author response image 3.

      As shown in the graph, there are no statistical differences between the raw migration data (the data that we actually measured in the dish) and the modeled data in which ISS is accounted for as a variable. As such, we chose not to normalize to or account for ISS in our other experiments. We have now included the above RStudio analyses in our supplemental figures (Figure S1) as well.

      Also, in Fig 5 and 6, panels I and J omit the effects of drug on mTOR phosphorylation as shown for other conditions.

      Both SC-79 and MK-2206 were selected for our experiments after thorough analysis of their effects on human epithelial cells and other cultured cells (citations in manuscript). However, initially, we did not know whether either of these drugs would modulate the mTOR pathway in human NPCs; thus, in Figures 5A, 5D, 6A and 6D we chose to focus on two of our datasets to establish the effect of these drugs in human NPCs. Our experiments in Family-1 and Family-2 showed us that SC-79 increases PS6 in human NPCs while MK-2206 downregulates it. Once this was established, we knew the drugs would have similar effects in the NPCs from the other families. Thus, we only conducted a proof-of-principle test to confirm that the drugs do indeed have the intended effect in I-ASD-3 and 16pDel. We have included these proof-of-principle westerns in Figure 5I, 5K, 6I and 6K to show that the effects of these drugs are reproducible across all our NPC lines. We did not include quantification since the data come from a single proof-of-principle western.

    1. Author response

      eLife assessment

      Using a genetically controlled experimental setting, the authors find that the lack of Polycomb-dependent epigenetic programming in the oocyte and early embryo influences the developmental trajectory through gestation in the mouse. By showing a two-phase outcome of early growth restriction followed by enhancement, the authors address previous inconsistencies in the field. However, the link with placenta function and gene misregulation is not yet fully supported.

      We thank the Reviewers for their constructive comments. In response we have added significantly more data to the study and substantially rewritten the manuscript. New data include analyses of glucose, amino acid and metabolite levels in fetal and maternal blood samples, more highly resolved fetal growth analyses, a more detailed study of the hyperplastic placenta including IF analyses of labyrinth area, labyrinth to placenta and capillary to labyrinth ratios. We have also added analyses of placental DNA methylation state in offspring from oocytes lacking EED, which reveals a range of DNA methylation changes at imprinted and non-imprinted genes in HET-hom offspring compared to HET-het or WT-wt controls.

      Reviewer #1 (Public Review):

      Oberin, Petautschnig et al. investigated the developmental phenotypes that resulted from oocyte-specific loss of the EED (Embryonic Ectoderm Development) gene - a core component of the Polycomb repressive complex 2 (PRC2), which possesses histone methyltransferase activity and catalyses trimethylation of histone H3 at lysine 27 (H3K27). The PRC2 complex plays essential roles in regulating chromatin structure, being an important regulator of cellular differentiation and development during embryogenesis. As novel findings, the authors find that disruption of PRC2-dependent programming in the oocyte, via loss of the core component EED, causes placental hyperplasia, and propose that an increase in transplacental flux of nutrients leads to fetal and postnatal overgrowth. At the mechanistic level, they show altered expression of genes previously implicated in placental hyperplasia phenotypes. They also establish an interesting parallel with the placental hyperplasia phenotype that is frequently observed in cloned mice.

      Strengths:

      The mouse breeding experiments are very well designed and are powerful to exclude potential confounding genetic effects on the developmental phenotypes that resulted from the loss of EED in oocytes. Another major strength is the developmental profiling across gestation, from pre-implantation to late gestation.

      Weaknesses:

      The evidence for 'oocyte' programming is restricted to phenotypic and gene expression analysis, without measurements of epigenetic dysregulation. It would be an added value if the authors could show evidence for altered H3K27me3 or DNA methylation in the placenta, for example.

      In a previous study we identified a large number of developmentally important genes that accumulated H3K27me3 in primary-secondary stage growing oocytes and were repressed by EED (Jarred et al., 2022 Clinical Epigenetics). However, H3K27me3 was removed from all of these genes during preimplantation development, indicating that maternal inheritance of H3K27me3 at a wide range of genes is unlikely (Jarred et al., 2022 Clinical Epigenetics). Consistent with this, only a small number of genes, including Slc38a4 and C2MC, have been shown to be functionally important in H3K27me3-dependent imprinting (Matoba et al., 2022 Genes and Development). Moreover, a related study showed that deletion of Setd2 and consequent loss of H3K36me3 in oocytes led to spreading of H3K27me3 into regions that were otherwise marked by H3K36me3 and DNA methylation (Xu et al. 2019 Nature Genetics 51:844–56). Based on these studies, we proposed that loss of EED and H3K27me3 may result in the ectopic spreading of H3K36me3 and DNA methylation in oocytes and that altered DNA methylation may then be transmitted to offspring and affect developmental outcomes (Jarred et al., 2022 Clinical Epigenetics).

      Given this hypothesis, we analysed DNA methylation rather than H3K27me3 in the placenta of WT-wt, HET-het and HET-hom offspring. This revealed differentially methylated regions (DMRs) in HET-hom placentas at two H3K27me3 imprinted genes, Sfmbt2 (C2MC) and Mbnl2, at five classically imprinted genes and at 74 DMRs not associated with imprinted loci. Together, our data support the hypothesis from Jarred et al., 2022 Clinical Epigenetics that loss of EED in oocytes results in altered DNA methylation patterning at both imprinted and non-imprinted genes in offspring and that this is likely to affect offspring growth and development. However, whether these changes result from direct alteration of DNA methylation in oocytes remains unclear.

      These new data are now included in results (Lines 387-409), Figure 6I, Supplementary File H-J and Discussion Lines 569-581.

      Reviewer Comment 1. The claim that placental hyperplasia drives offspring catch-up growth is not supported by current experimental data. The authors do not address if transplacental flux is increased in the hyperplastic placentae, measure amino acids and glucose in fetal/maternal plasma, or perform tetraploid rescue experiments to ascertain the contribution of the placenta to growth phenotypes. Furthermore, it is unclear, from the current data, if the surface area for nutrient transport is actually increased in the hyperplastic placenta and the extent to which other cell populations (i.e. spongiotrophoblasts) are affected in addition to glycogen cells. In addition, one of the supporting conclusions that the placenta is a key contributor to fetal overgrowth is based on a very crude measurement - placenta efficiency - which the authors claim is increased in the homozygous mutants compared to controls. After analysing the data carefully, I find evidence for decreased placental efficiency instead. I believe that the authors mistakenly present the data as placenta to fetal weight ratios, which led to the misinterpretation of the 'efficiency' concept.

      We thank the reviewer for pointing out our error in the placental efficiency data and we have now corrected the placental efficiency graphs (fetal/placental weight ratios) and updated the text throughout the manuscript as required (Figure 3I-K). As requested and described below, we have also added significantly more data, which support the conclusion that placental function is not enhanced in HET-hom mice and is unlikely to support fetal growth recovery.

      The new data and analyses we have added include:

      1. Further analyses of glycogen-enriched and non-glycogen-enriched cell counts in the decidua and junctional zones (Figure 4F-J)

      2. Total glycogen cell counts for male and female placentas (Figure 4 – figure supplement 1F)

      3. New analyses of fetal blood glucose levels at E17.5 and E18.5 and matching data from the mothers of each litter (Figure 4M)

      4. New analyses of the circulating amino acid levels and metabolites in fetal blood of E17.5 offspring and matching data from the mothers of each litter (Figure 8)

      5. New IF analyses of CD31 (PECAM-1), combined with machine learning assisted quantitative analyses of labyrinth and capillary areas using HALO (Figure 5)

      6. Separated male and female offspring and placental weights at E14.5 and E17.5 and total areas of the placenta, decidua, junctional zone and labyrinth (Figure 3 – figure supplement 1) which provide more insight into potential sex-specific differences in HET-hom offspring and placenta

      We have significantly re-written the results and discussion to reflect our new data and interpretation.

      While we did not assess transplacental flux, our new data revealed: 1. HET-hom fetuses had lower blood glucose levels at E18.5; 2. Circulating levels of amino acids and a wide range of metabolites did not differ between HET-hom and control offspring, or between the mothers of these offspring; 3. HET-hom placentas had lower total labyrinth area, labyrinth/placenta and capillary/labyrinth ratios based on analysis of total capillary and labyrinth areas, indicating that the surface area for nutrient transfer is not increased.

      Together these data strongly indicate that hyperplastic HET-hom placentas do not provide greater support to HET-hom fetuses than controls, and that increased placental function in HET-hom offspring is unlikely to explain the late gestation fetal growth recovery we observed in HET-hom offspring or how HET-hom offspring were able to attain normal weights by birth.

      While we have not directly counted the spongiotrophoblast populations, we have now included analyses of both the glycogen-enriched and non-glycogen cell populations in the junctional zone and the decidua (Figure 4H-K). This revealed an increased area of both glycogen-enriched and non-glycogen cells in the junctional zone and in the decidua of HET-hom placentas, consistent with the greater junctional zone/placenta ratio observed in HET-hom placentas (Figure 4D). Together with data in Figure 4C-F and Supp. Fig. 3, our observations demonstrate that the overall decidua and junctional zone areas were increased in HET-hom offspring, but there was a disproportionate expansion of the junctional zone that was caused by increased areas of both glycogen and non-glycogen-enriched cells.

      Tetraploid rescue experiments would require a very significant amount of time and investment and are technically very demanding. While creation of complementary tetraploid offspring would be informative, unfortunately these experiments are beyond the scope of this current study.

      Reviewer Comment 1 cont. The authors do not mention alternative explanations for the observed fetal catch-up and postnatal overgrowth. Why would oocyte epigenetic programming effects be restricted to the placenta, and not include fetal organs?

      Our intention was certainly not to convey a message that effects may be placenta specific. Indeed, our ongoing work beyond the scope of this study provides evidence for effects in other tissues (brain and bones) that will be published elsewhere. Our new data clearly show low placental efficiency, low fetal blood glucose, low capillary/labyrinth ratio and no impact on circulating fetal amino acid or metabolite levels in HET-hom offspring. In light of these new data, we have reinterpreted the findings of this study and substantially updated the discussion.

      Given our observations that fetal growth rate markedly increased during late gestation, but placental efficiency was reduced, our data strongly indicate that the effects of altered epigenetic oocyte programming due to loss of Eed affect both the placenta and the fetus. While our findings are significant, the precise mechanism underlying this growth response in HET-hom fetuses remains unknown. Understanding this mechanism will require substantially more work that will be the subject of future studies.

      Reviewer #2 (Public Review):

      Consistent fetal growth trajectories are vital for survival and later life health. The authors utilise an elegant and novel animal model to tease apart the role of Eed protein in the female germline from the role of somatic Eed. The authors were able to experimentally attribute placental overgrowth - particularly of the endocrine region of the placenta - to the function of Eed protein in the oocyte. Loss of Eed protein in the oocyte was also associated with dynamic changes in fetal growth and prolonged gestation. It was not determined whether the reported catch-up growth apparent on the day of birth was due to enhanced fetal growth very late in gestation, a longer gestational time ie the P0 pups are effectively one day "older" compared to the controls, or the pups catching up after birth when consuming maternal milk.

      To understand if increased growth occurred in HET-hom fetuses prior to birth, we have now included analyses of offspring weight at E18.5 (Figure 2F), all pups collected with a verified E19.5 birth date (Figure 2J) and for pups from similar litter sizes (5-7 pups) at E19.5 (Figure 2K). Together with our existing data, these additional analyses provide average weights for fetuses at E14.5, E17.5, E18.5 and pups born on E19.5. This confirmed that HET-hom offspring undergo enhanced growth in the last few days of pregnancy, resulting in the progression of substantially growth and developmentally restricted HET-hom fetuses at E14.5, to pups with normal weight at birth within the 40% of pregnancies that were born on E19.5 in a normal gestational time.

      However, in addition, gestational length was increased by one to two days in 60% of pregnancies from hom oocytes, but not in control pregnancies from het or wt oocytes. As average weights were significantly greater in all surviving HET-hom offspring at P0 (i.e. surviving pups born on E19.5-E21.5; Figure 2G), it appears that this additional gestational time contributed to the offspring overgrowth. This is logical, however it does not explain how growth and developmentally delayed fetuses at E14.5 attained normal weight and developmental stage by E19.5 (Figure 2J-K).

      Together our data clearly show that HET-hom offspring undergo enhanced growth during the late stages of pregnancy, allowing them to resolve the developmental delay and growth insufficiency observed at E14.5 so that they were born at normal weight and stage at E19.5. In addition, increased gestational time contributes to weight of pups delivered on E20.5 or 21.5, partly explaining the overgrowth phenotype observed in this model.

      The idea that increased milk consumption may explain the overgrowth of HET-hom offspring is interesting. It is possible that the increased growth rate of HET-hom offspring continues after birth and contributes to overgrowth. However, examining this outcome in a tightly controlled manner is complicated given that we cannot predict the day of birth of HET-hom litters, and that these litters are generally small and would need to be fostered on the day of birth alongside control litters. Given these challenges and that our primary observation is that HET-hom offspring underwent fetal growth recovery during pregnancies of normal length and via extension of gestational length, we have not examined the possibility of increased milk consumption after birth.

      We have updated the results to reflect the new analyses and have provided relevant discussion to address these data. Our description of these data can be found in Results (lines 165-197) and in Figure 2.

      Reviewer #3 (Public Review):

      My understanding of the main claims of the paper, and how they are justified by the data are discussed below:

      Overall, loss of PRC2 function in the developing oocyte and early embryo causes:

      1) Growth restriction from at least the blastocyst stage with low cell counts and midgestational developmental delay.

      Strengths:

      • Live embryo imaging added an important dimension to this study. The authors were able to confirm an unquantified finding from a previous lab (reduced time to 2-cell stage in oocyte-deletion Eed offspring, Inoue 2018, PMID: 30463900) as well as identify developmental delay and mortality at the blastocyst-hatching transition.

      • For the weight and morphological analysis the authors are careful to provide isogenic controls for most of the experiments presented. This means that any phenotypes can be attributed to the oocyte genotype rather than any confounding effects of maternal or paternal genotype.

      • Overall, there is good evidence that oocyte deletion of Eed results in early embryonic growth restriction, consistent with previous observations (Inoue 2018, PMID: 30463900).

      Reviewer 3, Comment 1: Weaknesses: Gaps in the reporting of specific features of the methodology make it difficult to interpret/understand some of the results.

      While we are unsure exactly which methods Reviewer 3 would like expanded, we have updated parts that we thought required further detail to allow more informed interpretation of the results. These include methods for placental histology (Lines 650-669) and immunohistochemistry (Lines 671-690), and new methods for CD31 immunofluorescence (Lines 692-714), glucose and metabolomics (Lines 752-769) and DNA methylation (RRBS; Lines 734-750) analyses.

      To clarify the approach taken for histology, immunohistochemical and immunofluorescent staining, sections were cut in compound series from the centre of each placenta, ensuring that we collected representative data for each sample. QuPath was used to quantify the decidual and junctional zone areas in one complete, fully intact midline section for each placenta as close to the midline as possible. This provided data from 10 placentas for each genotype. In addition, glycogen-enriched and non-glycogen-enriched cells were identified and quantified using machine learning assisted QuPath analyses of the whole placenta, decidua and junctional zone regions. We have also added quantitative analyses of the labyrinth and labyrinth capillary network using immunofluorescent CD31 staining and machine learning assisted HALO software. This new analysis of placental morphology is included in the methods section.

      Moreover, as there were no sex-specific differences in placental morphology or weight, we combined the samples from both sexes to provide greater numbers for analysis in each genotype. For example, as described for the analyses of labyrinth and capillaries using CD31 IF, 4 placentas of each sex were used for data collection. This provided data from a total of 8 placentas (4 male and 4 female) for each genotype from a total of 17 WT-wt (9 male and 8 female), 21 HET-het (9 male and 12 female) and 24 HET-hom (16 male and 8 female) sections (2-3 sections/placenta).

      Reviewer 3, Comment 2: Placental hyperplasia with disproportionate overgrowth of the junctional trophoblast especially the glycogen trophoblast (GlyT) cells.

      Strengths: • The authors provide a comprehensive description of how placental and embryo weight is affected by the oocyte-Eed deletion through mid-to-late gestation development. The case for placentomegaly is clear.

      Weaknesses:

      • The placental efficiency data presented in Figure 3G-I is incorrect. Placental efficiency is calculated as embryo mass/placental mass, and it increases over the late gestation period. For E14.5, for example (Fig 3G), WT-wt embryo mass = ~0.3 g, placenta mass = 0.11 g (from Fig 3D) = placental efficiency 2.7; HET-hom = 0.25/0.12 = 2.1. The paper gives values: WT-wt 0.5, HET-hom 0.7. Have the authors perhaps divided placenta weight by embryo mass? This would explain why the E17.5 efficiencies are so low (WT-wt 0.11 rather than a more usual figure of 8.88). If this is the case then the authors' conclusion that placental efficiency is improved by oocyte deletion of Eed is wrong - in fact, placental efficiency is severely compromised.
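
      The reviewer's arithmetic can be reproduced with a minimal sketch. The masses below are the approximate E14.5 values quoted in the bullet above; the function name and data layout are illustrative assumptions, not part of the authors' analysis.

```python
def placental_efficiency(fetal_mass_g: float, placental_mass_g: float) -> float:
    """Fetal (embryo) mass divided by placental mass, both in grams."""
    if placental_mass_g <= 0:
        raise ValueError("placental mass must be positive")
    return fetal_mass_g / placental_mass_g


# Approximate E14.5 masses quoted in the review: (fetus g, placenta g).
e14_5_masses = {"WT-wt": (0.30, 0.11), "HET-hom": (0.25, 0.12)}

for genotype, (fetus_g, placenta_g) in e14_5_masses.items():
    efficiency = placental_efficiency(fetus_g, placenta_g)
    # The inverted ratio (placenta/fetus) is what the reviewer suspects was plotted.
    inverted = placenta_g / fetus_g
    print(f"{genotype}: efficiency={efficiency:.1f}, inverted ratio={inverted:.2f}")
```

      With the ratio oriented as fetal/placental mass, the HET-hom value falls below the WT-wt value, which is the direction of the reviewer's conclusion.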

      The authors have performed cell type counting on histological sections obtained from placentas to discover which cells are contributing to the placentomegaly. This data is presented as %cell type area in the main figure, though the untransformed cross-sectional area for each cell type is shown in the supplementary data. This presentation of the data, as well as the description of it, is misleading because, while it emphasises the proportional increase in the endocrine compartment of the placenta it downplays the fact that the exchange area of the mutant placentas is vastly expanded. This is important for two reasons.

      Firstly, the whole placenta is increased in size, suggesting that the mechanism is not placental lineage-specific and is instead acting on the whole organ. Secondly, in relation to embryonic growth, generally speaking, genetic manipulations that modify labyrinthine volume tend to have a positive correlation with fetal mass, whereas the relationship between junctional zone volume and embryonic mass is more complex (discussed in Watson PMID: 15888575, for example). The authors should reconsider how they present this data in light of the previous point.

      We thank the reviewer for pointing out our error in the placental efficiency analysis and apologise for this error. We have corrected the presentation and interpretation of these data and have described this in detail in our response to Reviewer 1, Comment 1.

      As discussed in our response to Reviewer 1, Comment 1, we have added a range of analyses to determine whether placental efficiency was enhanced in HET-hom offspring. These include measuring fetal and maternal circulating glucose levels (Figure 4K), individual amino acids and an extensive range of metabolites (Figure 8) and providing CD31 immunofluorescent analyses of labyrinth area, labyrinth/placental ratio and capillary/labyrinth ratio in HET-hom and control placentas (Figure 5).

      We also added analyses of glycogen enriched and non-glycogen-enriched cell counts in the decidua and junctional zones. As suggested by Reviewer 3, both glycogen-enriched and non-enriched cell populations are significantly increased in HET-hom placentas.

      Combined, these new analyses significantly expand the study and support the conclusion that placental efficiency in HET-hom offspring was either compromised or not different from controls, depending on the analysis. We find no evidence that placental efficiency was increased in HET-hom offspring and have reworked our results and discussion sections to reflect these new data and interpretation.

      Reviewer 3, Comment 2 cont: Again, some of the methods are not clearly reported making interpretation difficult - especially how they have estimated their GlyT number.

      As outlined in our response to Reviewer 3 Comment 1, in the methods section we have added further detail of how we counted glycogen-enriched and non-enriched cells in the decidua and junctional zone regions of sections from the middle of WT-wt, WT-het, HET-het and HET-hom placentas (Lines 650-669).

      Reviewer 3, Comment 3: Perinatal embryonic/pup overgrowth.

      Strengths:

      • The overgrowth exhibited by the oocyte-Eed-deleted pups is striking and confirms the previous work by this group (Prokopuk, 2018). This is an important finding, especially in the context of understanding how PRC2-group gene mutations in humans cause overgrowth syndromes. It is also intriguing because it indicates that genetic/environmental insults in the mother that affect her gamete development can have long-term consequences on offspring physiology.

      Weaknesses:

      • Is the overgrowth intrauterine or is it caused by the increase in gestation length? The way the data is reported makes it impossible to work this out. The authors show that gestation time is consistently lengthened for mothers incubating oocyte-Eed-deleted pups by 1-2 days. In the supplementary material, the mutant embryos are not larger than WT at e19.5, the usual day of birth. Postnatal data is presented as day post-parturition. It would probably be clearer to present the embryonic and postnatal data as days post coitum. In this way, it will be obvious in which period the growth enhancement is taking place. This is information really important to determine whether the increased growth of the mutants is due to a direct effect of the intrauterine environment, or perhaps a more persistent hormonal change in the mother that can continue to promote growth beyond the gestation period.

      We have used embryonic day (E) to denote embryo and fetal age throughout the study – this is the same as using DPC (i.e. E19.5 is equivalent to 19.5 DPC). As described in the Methods “Collection of post-implantation embryos, placenta and postnatal offspring”, mice were time mated for two-four nights, with females plug checked daily. Positive plugs were noted as day E0.5.

      To make the data presentation clearer, we have shown the data for surviving HET-hom pups born on E19.5 (Figure 2J) separately from all HET-hom surviving pups born on E19.5-E21.5 (Figure 2G). As discussed in our response to Reviewer 2, we have also included growth data for pregnancies at E14.5, E17.5, E18.5 (Fig. 2C-F) and E19.5 (Figure 2J,K), as well as P0 (combined data for surviving pups born E19.5-E21.5), and P3 (combined data for surviving pups born E19.5-E21.5, Figure 2G,H).

      These data clearly show that HET-hom fetuses are substantially growth and developmentally delayed at E14.5 (Figure 2D), but HET-hom pups born on E19.5 are the same weight as WT-wt, WT-het and HET-het control pups (Figure 2J). This demonstrates that weight of HET-hom fetuses is normalised in utero between E14.5 and day of birth on E19.5.

      Importantly, as requested by Reviewer 3, we have separated average weight for all surviving pups with a day of birth of E19.5-21.5 (Figure 2G) from average weight of pups born on E19.5 only (Figure 2J). These analyses revealed that the average weight of surviving pups born between E19.5-21.5 was significantly higher than for controls (Figure 2G), but the average weight of pups born on E19.5 only was not. It is therefore clear that extended gestation also contributed to increased HET-hom pup birth weight. We have updated these additional analyses in Results (Lines 165-197) and Figure 2.

      As revealed in Figure 2H, it is also possible/likely that growth of HET-hom pups during the three days post-partum may have contributed to the offspring overgrowth we observed in this and our previous study (Prokopuk et al., 2018 Clinical Epigenetics). However, we cannot determine whether there is a contribution from a persistent maternal hormonal change that promotes post-natal offspring growth or whether there is an innate growth benefit in HET-hom pups. As this is very difficult to dissect, separating these possibilities is beyond the scope of our study.

      Reviewer 3, Comment 4: "fetal growth restriction followed by placental hyperplasia, .. drives catch-up growth that ultimately results in perinatal offspring overgrowth".

      Here the authors try to link their observations, suggesting that i) the increased perinatal growth rate is a consequence of placentomegaly, and ii) the placentomegaly/increased fetal growth is an adaptive consequence of the early growth restriction. This is an interesting idea and suggests that there is a degree of developmental plasticity that is operating to repair the early consequences of transient loss of Eed function.

      Strengths:

      • Discrepancies between earlier studies are reconciled. Here the authors show that in oocyte-Eed-deleted embryos growth is initially restricted and then the growth rate increases in late gestation with increased perinatal mass.

      Weaknesses:

      • Regarding the dependence of fetal growth increase on placental size increase, this link is far from clear since placental efficiency is in fact decreased in the mutants (see above).

      • "Catch-up growth" suggests that a higher growth rate is driven by an earlier growth restriction in order to restore homeostasis. There is no direct evidence for such a mechanism here. The loss of Eed expression in the oocyte and early embryo could have an independent impact on more than one phase of development.

      Firstly, there is growth restriction in the early phase of cell divisions. Potentially this could be due to derepression of genes that restrain cell division on autosomes, or suppression of X-linked gene expression (as has been previously reported, Inoue, 2018 PMID: 30463900). The placentomegaly is explained by the misregulation of non-canonically imprinted genes, as the authors report (and in agreement with other studies, e.g. Inoue, 2020. PMID: 32358519).

      • Explaining the perinatal phase of growth enhancement is more difficult. I think it is unlikely to be due to placentomegaly. Multiple studies have shown that placentomegaly following somatic cell nuclear transfer (SCNT) is caused by non-canonically imprinted genes, and can be rescued by reducing their expression dosage. However, SCNT causes placentomegaly with normal or reduced embryonic mass (for example -Xie 2022, PMID: 35196486), not growth enhancement. Moreover, since (to my knowledge) single loss of imprinting models of non-canonically imprinted genes do not exist, it is not possible to understand if their increased expression dosage can drive perinatal overgrowth, and if this is preceded by growth restriction and thus constitutes 'catch up growth'.

      Reviewer 3 is correct in their assessment that placental efficiency was decreased in HET-hom offspring and we have corrected the placental efficiency analysis based on fetal/placental weight ratios (discussed in detail in our response to Reviewer 1 Comment 1). We have added substantially more data (glucose, amino acids, metabolites, labyrinth capillary area and density). These data support the conclusion that a placentally driven advantage for HET-hom fetal growth is unlikely, despite our observation that HET-hom fetuses are developmentally delayed and underweight at E14.5, but are born at normal weight after a normal gestational length (19.5 days) (discussed in our responses to Reviewer 3, Comment 3 and Reviewer 2).

      This demonstrates that HET-hom fetuses are able to attain normal birth weight despite being initially growth restricted at E14.5, and that this occurs despite low placental function. Moreover, as we compared isogenic offspring with heterozygous loss of Eed (HET-het compared to HET-hom offspring), the outcomes we observed in HET-hom offspring originate from loss of EED in the growing oocyte or loss of maternal EED in the zygote, strongly suggesting that a non-genetic mechanism is involved.

      As pointed out by Reviewer 3, the initial developmental delay in HET-hom offspring may be due to increased expression of genes that regulate cell proliferation – this could clearly explain the lower number of cells we observed in the ICM and the growth delay at later stages of embryonic and fetal development. Another possibility is that maternal PRC2 provided by the oocyte promotes cell divisions in preimplantation embryos. We have discussed these possibilities on Lines 467-476.

      In addition, Matoba et al 2022 demonstrated that deletion of maternal Xist together with Eed was able to rescue male-biased lethality in offspring from oocytes lacking Eed, revealing a clear role for X-linked genes in this phenotype (Matoba et al 2022, Genes and Development). However, deletion of maternal Xist did not properly normalise survival of offspring from Eed null oocytes (i.e. Eed/Xist double maternal null litters were smaller than litters derived from wild type oocytes), strongly suggesting that other mechanisms provide the capacity for HET-hom offspring to attain normal weight at birth. We have added further discussion of the Matoba study in the context of our study in the Discussion (Lines 544-555).

      Finally, with respect to the outcomes for SCNT derived offspring, we extracted SCNT fetal growth and placental weight data from the supplementary data included in Matoba et al., 2018 Cell Stem Cell. 2018;23(3):343-54.e5 and compared it with data collected in our study (Figure 7). This analysis revealed that the weights of placentas and fetuses of offspring derived via SCNT were very similar to the HET-hom offspring in our study and we have discussed the similarities and potential differences between HET-hom and SCNT offspring in the Discussion (Lines 478-500).

      As pointed out by Reviewer 3, deletion of maternal non-canonically imprinted genes partially or fully rescued the placental hyperplasia phenotype in both SCNT-derived offspring and offspring from oocytes lacking EED. However, as we have discussed, the mechanisms underlying other aspects of the offspring phenotype, such as fetal growth recovery of HET-hom offspring observed in our study, remain unknown. Moreover, the comparison we provide in Figure 7 strongly indicates that HET-hom and SCNT fetuses are similarly delayed at E14.5 and undergo similar fetal growth recovery before birth, but the mechanism also remains unknown. Together, it appears that offspring derived from either Eed-null oocytes or by SCNT have an innate ability to remediate fetal growth restriction during the late stages of pregnancy without a requirement to correct maternally inherited impacts mediated by Xist or H3K27me3-dependent imprinting.

    1. Author response

      Reviewer #1 (Public Review):

      The main contribution appears to be related to functional specialization. I suggest clarifying the major novelty of the present report and to focus the introduction on it.

      We thank this reviewer for this suggestion. We have revised the introduction to emphasize the functional specialization question. The changes are extensive; we have included a tracked-changes version of the manuscript to make these edits easy to see.

      There is a growing literature on fluctuating neural firing patterns that is not considered in this report. The scholarship appears a bit impoverished with only 19 references, many of which point to work from this group of collaborators. I suggest that the authors consider the present work in the context of the wider literature more scholarly, even if not all the relations of these different lines of work can be conclusively connected at this point. For a few examples, there is work by Kienitz and colleagues on fluctuating neural patterns in V4 evoked by competing grating stimuli. Also, the work by Engel, Moore, and colleagues on 'on' and 'off' states in the context of selective attention seems relevant, or the work by Fiebelkorn and Kastner on rhythmic perception and attention.

      We agree completely with this suggestion! We have reworded the introduction to be more inclusive of other research in this area (especially Kienitz and colleagues – exciting work that we are pleased to have had brought to our attention) and we have added about 500 words in the Discussion to cover the work on on/off states (Engel et al.), rhythmic perception (Fiebelkorn & Kastner and others), and attention more generally (e.g., Treisman & Gelade’s work on serial sampling). We are particularly pleased to add these sections because these topics are very much on our minds – we have a commentary piece under review elsewhere in which we evaluate these synergistic lines of approach in a more complete fashion. In total, we’ve added about 15 additional references.

      Reviewer #2 (Public Review):

      The description of the results would benefit from a better explanation of how low spike counts may influence the outcome of the analysis. Due to a smoothing procedure used for visualization, the spike counts for the paired stimuli (AB, black lines) shown in Figure 3a-b and Figure 4a-d go below 0. However, the actual spike count on a trial can not go below 0. The symmetric smoothing procedure may hide an underlying skewed distribution of spike counts that can only be positive. The statistical analysis is not performed on the smoothed distribution but on the actual spike counts, and the validity of the result is therefore not in question. However, the paper would benefit from 1) visualization of the unsmoothed trial counts, and 2) an explanation of how assumptions of symmetric/skewed distributions may affect the outcome.

      We thank the reviewers for noting this and making these suggestions. We now include unsmoothed raw spike counts in all the example figures (Figure 3a-b and Figure 4a-d). With regard to the symmetric/skewed distributions and the analysis methods, a Poisson distribution is skewed at low rates and becomes more symmetric at higher rates, so this is already incorporated into the analysis. Indeed, the utility of Poisson distributions for fitting non-negative data is one of the reasons these distributions are so commonly used in neuroscience. We now make this point explicitly at the beginning of Methods/Data analysis: “Our method centers on modeling spike counts based on Poisson distributions, a common technique for handling non-negative count data in neuroscience and other fields.” With this edit as well as the revised example figures now making clear that no spike counts are below zero, we are optimistic that readers will better understand the analysis method and how the shape of response distributions is incorporated into it.
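
      As a minimal illustration of the point about Poisson skewness (a sketch only, not the analysis code used in the paper), the snippet below computes the skewness of a Poisson distribution numerically and compares it with the textbook value of 1/sqrt(rate). The function names, the truncation point `k_max`, and the example rates are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k spikes when the mean count is `rate` (rate > 0)."""
    # Work in log space so large k does not overflow the factorial.
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def poisson_skewness(rate: float, k_max: int = 500) -> float:
    """Skewness of a Poisson distribution, summed numerically up to k_max."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, rate) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5


# Illustrative mean spike counts (hypothetical, not taken from the data sets).
for rate in (1.0, 4.0, 25.0, 100.0):
    print(f"mean count {rate:6.1f}: skewness {poisson_skewness(rate):.3f}, "
          f"theory {1 / math.sqrt(rate):.3f}")
```

      The skewness shrinks as the mean count grows, which is why a Poisson-based model handles both the skewed low-rate regime and the nearly symmetric high-rate regime without a separate assumption.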

    2. eLife assessment

      This important study adds to the growing body of evidence that neural responses fluctuate in time to alternately represent one among multiple concurrent stimuli and that these fluctuations cease when objects fuse into one perceived object. The present study provides solid evidence from multiple brain areas and stimulus types to support this hypothesis. Overall, the study illustrates how the brain can use the time dimension and synchrony to either parse or integrate stimuli into a coherent representation.

    3. Reviewer #1 (Public Review):

      The study by Schmehl and colleagues asks an important question, i.e. how are multiple objects/stimuli represented in the visual system despite broad tuning properties of neurons along multiple different dimensions (e.g. space, features). This is a continuation of an impactful and highly significant line of work from the Groh lab and their collaborators. In previous work, they showed that fluctuations in firing patterns may be critical in representing multiple objects and parsing them in time. In this particular study, the authors ask three specific questions to extend these observations: (i) Are such fluctuations widespread in the visual system? (ii) Are they related to the perceptual distinction of objects? (iii) How are they related to the functional specialization of neuronal populations along feature dimensions (e.g. faces, motion)?

      It seems to me that there is ample evidence for the first two questions from previous work by these authors. For (i), fluctuations in firing patterns related to multiple stimuli have been shown in the auditory system (e.g. inferior colliculus, Caruso et al., 2018) and in multiple areas of the visual system (i.e. V1, V4, and the face patch system; Caruso et al., 2018; Jun et al., 2022). The present study adds data from MT to this increasing evidence. For (ii), Jun et al., 2022 already showed that fluctuations are not related to stimuli perceived as merged, or not distinct. Thus, the main contribution appears to be related to functional specialization. I suggest clarifying the major novelty of the present report and focusing the introduction on it.

      The present work analyzed three different data sets acquired in different areas (V1, V4, MT, IT face network), using different feature stimuli (motion, faces), obtained under various attention conditions/states (passive fixation, actively ignored). Many of the results are nice confirmations and minor extensions of previous work. The conceptual advance and novelty of the findings are therefore limited.

      There is a growing literature on fluctuating neural firing patterns that is not considered in this report. The scholarship appears a bit impoverished with only 19 references, many of which point to work from this group of collaborators. I suggest that the authors situate the present work more thoroughly in the context of the wider literature, even if not all of these lines of work can be conclusively related at this point. For a few examples, there is work by Kienitz and colleagues on fluctuating neural patterns in V4 evoked by competing grating stimuli. Also, the work by Engel, Moore, and colleagues on 'on' and 'off' states in the context of selective attention seems relevant, or the work by Fiebelkorn and Kastner on rhythmic perception and attention.

    4. Reviewer #2 (Public Review):

      In a beautiful line of work, the authors have proposed the intriguing idea that the activity patterns of neurons can fluctuate between representing each of the multiple stimuli in their receptive fields. This allows for time-multiplexing of information by neural populations. The idea was initially proposed by Caruso et al (2018) and tested for both auditory and visual stimuli and later extended in Jun et al (2022). The current study analyzes additional datasets to further extend the conclusions across multiple areas and different stimulus sets.

      Together with the earlier work, the current study provides solid evidence for the hypothesis that fluctuating activity patterns in neurons representing multiple stimuli may be a general phenomenon. This exciting possibility may have implications for the studies of perception, attention, decision-making, and other cognitive functions.

      In the current study, the claim that the fluctuating activity patterns may be a general phenomenon is supported by multiple data sets from area MT and face patches MF and AL in IT cortex, using multiple stimulus sets (moving dots and gratings for MT, and face-face and face-object pairs for IT cortex). The major strength of this study is the consistency of the results across these areas and stimulus sets.

      The description of the results would benefit from a better explanation of how low spike counts may influence the outcome of the analysis. Due to a smoothing procedure used for visualization, the spike counts for the paired stimuli (AB, black lines) shown in Figure 3a-b and Figure 4a-d go below 0. However, the actual spike count on a trial cannot go below 0. The symmetric smoothing procedure may hide an underlying skewed distribution of spike counts that can only be non-negative. The statistical analysis is not performed on the smoothed distribution but on the actual spike counts, and the validity of the result is therefore not in question. However, the paper would benefit from 1) visualization of the unsmoothed trial counts, and 2) an explanation of how assumptions of symmetric/skewed distributions may affect the outcome.
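
      The concern about symmetric smoothing can be made concrete with a small sketch. It assumes a Gaussian kernel (the smoothing procedure actually used for the figures is not specified in this review) and uses made-up spike counts; `mass_below_zero`, the bandwidth, and both sample lists are hypothetical.

```python
import math
from typing import Sequence


def mass_below_zero(counts: Sequence[int], bandwidth: float) -> float:
    """Fraction of a Gaussian-kernel-smoothed distribution that lies below 0.

    Each kernel centred at count s places Phi(-s / bandwidth) of its mass
    below zero, where Phi is the standard normal CDF.
    """
    per_kernel = (0.5 * math.erfc(s / (bandwidth * math.sqrt(2))) for s in counts)
    return sum(per_kernel) / len(counts)


# Hypothetical trial-wise spike counts, for illustration only.
low_rate_counts = [0, 0, 1, 1, 1, 2, 3, 5]
high_rate_counts = [38, 41, 44, 45, 47, 50, 52, 55]

for label, counts in (("low rate", low_rate_counts), ("high rate", high_rate_counts)):
    frac = mass_below_zero(counts, bandwidth=1.5)
    print(f"{label}: {frac:.1%} of the smoothed curve lies below zero")
```

      Under these assumptions a sizeable share of the smoothed curve falls below zero for the low-rate cell and essentially none for the high-rate cell, even though no individual count is negative in either case.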

      Overall, the authors have presented an interesting hypothesis that is supported by rigorous analysis, they clearly described the results, and they have given a fair discussion of what we can and cannot conclude from this dataset. This line of work deserves the attention of a broad audience within the field of neuroscience.

    1. Author Response

      The following is the authors’ response to the current reviews.

      We thank the editors and reviewers for their helpful comments, which have allowed us to improve the manuscript.

      Response to reviewer 2

      We thank the reviewer for this positive feedback, which requires no further revision.

      Response to reviewer 3

      We thank the reviewer for highlighting these additional points and provide further explanations on these below.

      Firstly, we started the analysis from a baseline of year 2000 because the largest international donor (the Global Fund) uses baseline malaria levels in the period 2000-2004 as the basis of their current allocation calculations (The Global Fund, Description of the 2020-2022 Allocation Methodology, December 2019). In the paper we compare our optimal strategy to a simplified version of this method, represented by our “proportional allocation” strategy.

      Even if our simulations started in the year 2015, a direct comparison with the Global Technical Strategy for Malaria 2016-2030 would not be possible due to the different approaches taken. The GTS was developed to progress towards malaria elimination globally and set ambitious targets of at least 90% reduction in malaria case incidence and mortality rates and malaria elimination in at least 35 countries by 2030 compared to 2015. Mathematical modelling at the time suggested that 90% coverage of WHO-recommended interventions (vector control, treatment and seasonal malaria chemoprevention) would be needed to approach this target (Griffin et al. 2016, Lancet Infectious Diseases). The global annual investment requirements to meet GTS targets were estimated at US$6.4 billion by 2020 and US$8.7 billion by 2030 (Patouillard et al. 2017, BMJ Global Health). This strategy therefore considers what resources would be required to achieve a specific global target, but not the optimized allocation of resources.

      Investments into malaria control have consistently been below the estimated requirements for the GTS milestones (World Health Organization 2022, World Malaria Report 2022). In our study, we therefore take a different perspective on how limited budgets can be optimally allocated to a single intervention (insecticide-treated nets) across countries/settings to achieve the best possible outcome for two objectives that are different to the GTS milestones (either minimizing the global case burden, or minimizing both the global case burden and the number of settings not having yet reached a pre-elimination phase). As stated in the discussion, our estimate of allocating 76% of very low budgets to high-transmission settings was similar to the global investment targets estimated for the GTS, where the 20 countries with the highest burden in 2015 were estimated to require 88% of total investments (Patouillard et al. 2017, BMJ Global Health). Nevertheless, we also show that if higher budgets were available, allocating the majority to low-transmission settings co-endemic for P. falciparum and P. vivax would achieve the largest reduction in global case burden. We acknowledge the modelling of a single intervention as one of the key limitations of this analysis, but this simplification was necessary in order to solve the complex optimisation problem. Computationally it would not have been feasible to optimize across a multitude of intervention and coverage combinations.

      A further limitation raised by the reviewer is the lack of cross-species immunity between P. falciparum and P. vivax in our model. While cross-reactivity between antibodies against these two species has been observed in previous studies and the potential implications of this would be important to explore in future work, we did not include it here as little is known to date about the epidemiological interactions between different malaria parasite species (Muh et al. 2020, PLoS Neglected Tropical Diseases).

      Lastly, we did not assume that transmission was homogeneous within the four transmission settings in our study (very low, low, moderate, high); transmission dynamics were simulated separately in each country, accounting for heterogeneous mosquito bite exposure. However, results were summarised for the broader transmission settings since many other country-specific factors were not accounted for (see discussion) and the findings should not be used to inform individual country allocation decisions.


      The following is the authors’ response to the original reviews.

      Author response to peer review

      We thank the reviewers for their insightful comments, which raise several important points regarding our study. As the reviewers have recognised, we introduced a number of simplifications in order to solve this complex optimisation problem, such as restricting the analysis to a single intervention (insecticide-treated nets) and modelling countries at a national level. Despite their clear relevance to the study, computationally it would not have been feasible to run the multitude of scenarios suggested by reviewer 1, which we recognise as a limitation. As such we agree with the assessment that this study primarily represents a thought experiment, based on substantive modelling and aggregate scenario-based analysis, to assess whether current policies are aligned with an optimal allocation strategy or whether there might be a need to consider alternative strategies. The findings are relevant primarily to global funders and should not be used to inform individual country allocation decisions; they also point to avenues for further research. This perspective also underlies our decision to start the analysis from a baseline of year 2000 as opposed to modelling the current 2023 malaria situation: the largest international donor (the Global Fund) uses baseline malaria levels in the period 2000-2004 as the basis of their allocation calculations (The Global Fund, Description of the 2020-2022 Allocation Methodology, December 2019) (1). A simplified version of this method is represented by our “proportional allocation” strategy. We have made several revisions to the manuscript to address the points raised by the reviewers, as detailed below.

      Reviewer #1 (Public Review):

      1. The authors present a back-of-the-envelope exploration of various possible resource allocation strategies for ITNs. They identify two optimal strategies based on two slightly different objective functions and compare 3 simple strategies to the outcomes of the optimal strategies and to each other. The authors consider both P falciparum and P vivax and explore this question at the country level, using 2000 prevalence estimates to stratify countries into 4 burden categories. This is a relevant question from a global funder perspective, though somewhat less relevant for individual countries since countries are not making decisions at the global scale.

      Thank you for this summary of the paper. We agree that our analysis is of relevance to global funders, but is not meant to inform individual country allocation decisions. In the discussion, we now state:

      p. 12 L19: “Therefore, policy decisions should additionally be based on analysis of country-specific contexts, and our findings are not informative for individual country allocation decisions.”

      1. The authors have made various simplifications to enable the identification of optimal strategies, so much so that I question what exactly was learned. It is not surprising that strategies that prioritize high-burden settings would avert more cases.

      Thank you for raising this point. Indeed, several simplifying assumptions were necessary to ensure the computational feasibility of this complex optimization problem. As a result, our study primarily represents a thought experiment to assess whether current policies are aligned with an optimal allocation strategy or whether there might be a need to consider alternative strategies. As now further outlined in the introduction, approaches to this have differed over time and it remains a relevant debate for malaria policy.

      p. 2 L22: “However, there remains a lack of consensus on how best to achieve this longer-term aspiration. Historically, large progress was made in eliminating malaria mainly in lower-transmission countries in temperate regions during the Global Malaria Eradication Program in the 1950s, with the global population at risk of malaria reducing from around 70% of the world population in 1950 to 50% in 2000 (2). Renewed commitment to malaria control in the early 2000s with the Roll Back Malaria initiative subsequently extended the focus to the highly endemic areas in sub-Saharan Africa (3).”

      We believe our findings not only confirm an “expected” outcome – that prioritizing high-burden settings would avert more cases – but also clearly illustrate various consequences of different allocation strategies that are implemented or considered in reality, which may not be so obvious. For example, we found that initially allocating a larger share of the budget to high-transmission countries could be almost optimal both in terms of reducing clinical cases and in terms of maximising the number of countries reaching pre-elimination. We also observed a trade-off between reducing burden and reducing the global population at risk (“shrinking the map”) through a focus on near-elimination settings, and we estimated the loss in burden reduction when following an elimination target.

      1. Generally, I found much of the text confusing and some concepts were barely explained, such that the logic was difficult to follow.

      Thank you for bringing this to our attention, and we regret to hear the manuscript was confusing to read. We believe that the revisions made as a result of the reviewer comments have now made the manuscript much easier to follow. We additionally passed the manuscript to a colleague to identify confusing passages, and have added a number of sentences to clarify key concepts and improve the structure.

      1. I am not sure why the authors chose to stratify countries by 2000 PfPR estimates and in essence explore a counterfactual set of resource allocation strategies rather than begin with the present and compare strategies moving forward. I would think that beginning in 2020 and modeling forward would be far more relevant, as we can't change the past. Furthermore, there was no comparison with allocations and funding decisions that were actually made between 2000 and 2020ish so the decision to begin at 2000 is rather confusing.

      Thank you for pointing this out. We have now made the rationale for this choice clearer in the manuscript. Our main reason for this was to allow comparison with the Global Fund funding allocation, which is largely based on malaria disease burden in 2000-2004. As stated in the paper, malaria prevalence estimates in the year 2000 are commonly considered to represent a “baseline” endemicity level, before large-scale implementation of interventions in the following decades. In the manuscript, the transmission-related element of the Global Fund allocation algorithm is represented in our “proportional allocation” strategy. Previously this was only mentioned in the methods, but we have now added the following in the results to address this comment of the reviewer:

      p. 6 L12: “Strategies prioritizing high- or low-transmission settings involved sequential allocation of funding to groups of countries based on their transmission intensity (from highest to lowest EIR or vice versa). The proportional allocation strategy mimics the current allocation algorithm employed by the Global Fund: budget shares are mainly distributed according to malaria disease burden in the 2000-2004 period. To allow comparison with this existing funding model, we also started allocation decisions from the year 2000.”

      The Global Fund framework additionally considers economic capacity and other specific factors, and we have now also included a direct comparison with the 2020-2022 Global Fund allocation in Supplementary Figure S12 (see Author response image 1).

      We agree that looking at allocation decisions from 2020 onward would also constitute a very interesting question. However, the high dimensionality in scenarios to consider for this would currently make it computationally infeasible to run on the global level. Not only would it have to include all interventions currently implemented and available for malaria at different levels of coverage, but also the option of scaling down existing interventions. Instead, our priority in this paper was to conduct a thought experiment including both P. falciparum and P. vivax on a large geographical scale.

      Author response image 1.

      Impact of the proportional allocation strategy and the 2020-2022 Global Fund allocation on global malaria cases (panel A) and the total population at risk of malaria (panel B) at varying budgets. Both strategies use the same algorithm for budget share allocation based on malaria disease burden in 2000-2004, but the Global Fund allocation additionally involves an economic capacity component and specific strategic priorities.

      1. I realize this is a back-of-the-envelope assessment (although it is presented to be less approximate than it is, and the title does not reveal that the only intervention strategy considered is ITNs) but the number and scope of modeling assumptions made are simply enormous. First, that modeling is done at the national scale, when transmission within countries is incredibly heterogeneous. The authors note a differential impact of ITNs at various transmission levels and I wonder how the assumption of an intermediate average PfPR vs modeling higher and lower PfPR areas separately might impact the effect of the ITNs.

      Thank you for this comment. We agree the title could be more specific and have changed this to “Resource allocation strategies for insecticide-treated bednets to achieve malaria eradication”.

      Regarding the scale of ITN allocation, it is true that allocation at a sub-national scale could affect the results. However, considering this at a national scale is most relevant for our analysis because this is the scale at which global funding allocation decisions are made in practice. A sentence explaining this has been added in the methods.

      p. 15 L8: “The analysis was conducted on the national level, since this scale also applies to funding decisions made by international donors (1).”

      Further considering different geographical scales would also require introducing other assumptions, for example about how different countries would distribute funding sub-nationally, whether specific countries would take cooperative or competitive approaches to tackle malaria within a region or in border areas, and about delays in the allocation of bednets in specific regions. These interesting questions were outside of the scope of this work, but certainly require further investigation.

      1. Second, the effect of ITNs will differ across countries due to variations in vector and human behavior and variation in insecticide resistance and susceptibility to the ITNs. The authors note this as a limitation but it is a little mind-boggling that they chose not to account for either factor since estimates are available for the historical period over which they are modeling.

      Thank you for pointing this out. We did consider this and mentioned it as a limitation. Nevertheless, the complexity of accounting for this should also be recognised; for example, there is substantial uncertainty about the precise relationship between insecticide resistance and the population-level effect of ITNs (Sherrard-Smith et al., 2022, Lancet Planetary Health) (4). Additionally, our simulations extend beyond the 2000-2023 period so further assumptions about future changes to these factors would also be required. Simplifying assumptions are inherent to all mathematical modelling studies and we consider these particular simplifications acceptable given the high-level nature of the analysis.

      1. Third, the assumption that elimination is permanent and nothing is needed to prevent resurgence is, as the authors know, a vast oversimplification. Since resources will be needed to prevent resurgence, it appears this assumption may have a substantial impact on the authors' results.

      Thank you for this comment. In the discussion, we have now expanded on this:

      p. 13 L3: “While our analysis presents allocation strategies to progress towards eradication, the results do not provide insight into allocation of funding to maintain elimination. In practice, the threat of malaria resurgence has important implications for when to scale back interventions.”

      We believe that from a global perspective, the questions of funding allocation to achieve elimination vs to maintain it can currently still be considered separately given the long timescales involved. The cost of preventing resurgence is not known, and one major problem in accounting for it would be identifying the relevant timescales over which to quantify it.

      1. The decision to group all settings with EIR > 7 together as "high transmission" may perhaps be driven by WHO definitions but at a practical level this groups together countries with EIR 10 and EIR 500. Why not further subdivide this group, which makes sense from a technical perspective when thinking about optimal allocation strategies?

      Thank you for pointing this out. The WHO categories used are better interpreted in terms of the corresponding prevalence, which places countries with a prevalence of over 35% in the high transmission category (WHO Guidelines for malaria, 31 March 2022) (5). We felt this was appropriate given that we are looking at theoretical global allocation patterns and do not aim to make recommendations for specific groups of countries or individual countries within sub-Saharan Africa that would be distinguished through the use of higher cut-offs. In our analysis, all 25 countries in the high transmission category were located in sub-Saharan Africa.

      1. The relevance of this analysis for elimination is a little questionable since no one eliminates with ITNs alone, to the best of my understanding.

      Thank you for this comment. We indeed state in the paper that ITNs alone are not sufficient to eliminate malaria. However, we still think that our analysis is relevant for elimination by taking a more theoretical perspective on reducing transmission using interventions. Starting from the 2000 baseline (or current levels) globally, large-scale transmission reductions such as those achieved by mass ITN distribution still represent the first key step on the path to malaria eradication, as shown in previous modelling work (Griffin et al., 2016, Lancet Infectious Diseases) (6). In the final phase of elimination, the WHO also recommends the addition of more targeted and reactive interventions (WHO Guidelines for malaria, 31 March 2022) (5). Our changes to the title of the article (“Resource allocation strategies for insecticide-treated bednets to achieve malaria eradication”) should now better reflect that we consider ITNs as just one necessary component to achieve malaria eradication.

      Reviewer #2 (Public Review):

      1. Schmit et al. analyze and compare different strategies for the allocation of funding for insecticide-treated nets (ITNs) to reduce the global burden of malaria. They use previously published models of Plasmodium falciparum and Plasmodium vivax malaria transmission to quantify the effect of ITN distribution on clinical malaria numbers and the population at risk. The impact of different resource allocation strategies on the reduction of malaria cases or a combination of malaria cases and achieving pre-elimination is considered to determine the optimal strategy to allocate global resources to achieve malaria eradication.

      Strengths:

      Schmit et al. use previously published models and optimization for rigorous analysis and comparison of the global impact of different funding allocation strategies for ITN distribution. This provides evidence of the effect of three different approaches: the prioritization of high-transmission settings to reduce the disease burden, the prioritization of low-transmission settings to "shrink the malaria map", and a resource allocation proportional to the disease burden.

      Thank you for providing this summary and outline of the strengths of the paper.

      1. Weaknesses:

      The analysis and optimization which provide the evidence for the conclusions and are thus the central part of this manuscript necessitate some simplifying assumptions which may have important practical implications for the allocation of resources to reduce the malaria burden. For example, seasonality, mosquito species-specific properties, stochasticity in low transmission settings, and changing population sizes were not included. Other challenges to the reduction or elimination of malaria such as resistance of parasites and mosquitoes or the spread of different mosquito species as well as other beneficial interventions such as indoor residual spraying, seasonal malaria chemoprevention, vaccinations, combinations of different interventions, or setting-specific interventions were also not included. Schmit et al. clearly state these limitations throughout their manuscript.

      The focus of this work is on ITN distribution strategies; other interventions are not considered. It also provides a global perspective, so analysis of the specific local setting (as also noted by Schmit et al.), different interventions, and combinations of interventions should also be taken into account for any decisions.

      Thank you for raising these points. As outlined at the beginning of our response, for computational reasons we indeed had to introduce several simplifying assumptions to solve this complex optimisation problem. Given the factors you highlighted, our study should primarily be interpreted as a thought experiment to assess whether current policies are aligned with an optimal allocation strategy or whether there might be a need to consider alternative strategies. The findings are relevant primarily to global funders and should not be used to inform individual country allocation decisions, which we have further clarified in the manuscript.

      1. Nonetheless, the rigorous analysis supports the authors' conclusions and provides evidence that supports the prioritization of funding of ITNs for settings with high Plasmodium falciparum transmission. Overall, this work may contribute to making evidence-based decisions regarding the optimal prioritization of funding and resources to achieve a reduction in the malaria burden.

      Thank you for this positive assessment of our work.

      Reviewer #1 (Recommendations For The Authors):

      1. L144: last paragraph, the focus on endemic equilibrium: I did not really understand this, when 39 years is mentioned later is that a different analysis? How are cases averted calculated in a time-agnostic endemic equilibrium analysis? Perhaps a little more detail here would be helpful.

      A further explanation of this has been added in the results and methods.

      p. 8 L 22: “To evaluate the robustness of the results, we conducted a sensitivity analysis on our assumption on ITN distribution efficiency. Results remained similar when assuming a linear relationship between ITN usage and distribution costs (Figure S10). While the main analysis involves a single allocation decision to minimise long-term case burden (leading to a constant ITN usage over time in each setting irrespective of subsequent changes in burden), we additionally explored an optimal strategy with dynamic re-allocation of funding every 3 years to minimise cases in the short term.”

      p. 17 L25: “To ensure computational feasibility, 39 years was used as it was the shortest time frame over which the effect of re-distribution of funding from countries having achieved elimination could be observed.”

      p. 18 L 9: “Global malaria case burden and the population at risk were compared between baseline levels in 2000 and after reaching an endemic equilibrium under each scenario for a given budget.”

      1. L148: what is proportional allocation by disease burden and how is that different from prioritizing high-transmission settings?

      Further details have been added in the text.

      p. 6 L12: “Strategies prioritizing high- or low-transmission settings involved sequential allocation of funding to groups of countries based on their transmission intensity (from highest to lowest EIR or vice versa). The proportional allocation strategy mimics the current allocation algorithm employed by the Global Fund: budget shares are mainly distributed according to malaria disease burden in the 2000-2004 period. To allow comparison with this existing funding model, we also started allocation decisions from the year 2000.”
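
      The difference between the two kinds of strategy described in this passage can be sketched in a few lines. This is a toy illustration under stated assumptions, not the optimisation or transmission model used in the study: the `Setting` fields, the `max_cost` cap on what a setting can absorb, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Setting:
    name: str
    eir: float       # transmission intensity (entomological inoculation rate)
    burden: float    # baseline clinical cases, arbitrary units
    max_cost: float  # assumed cost of reaching maximum ITN usage, arbitrary units


def proportional_allocation(settings: List[Setting], budget: float) -> Dict[str, float]:
    """Split the budget in fixed shares proportional to baseline burden."""
    total_burden = sum(s.burden for s in settings)
    return {s.name: budget * s.burden / total_burden for s in settings}


def sequential_allocation(settings: List[Setting], budget: float,
                          highest_first: bool = True) -> Dict[str, float]:
    """Fund settings one after another in order of EIR until the budget runs out."""
    remaining = budget
    allocation: Dict[str, float] = {}
    for s in sorted(settings, key=lambda s: s.eir, reverse=highest_first):
        spend = min(remaining, s.max_cost)
        allocation[s.name] = spend
        remaining -= spend
    return allocation


# Hypothetical settings and budget, for illustration only.
settings = [
    Setting("high", eir=80.0, burden=600.0, max_cost=50.0),
    Setting("moderate", eir=5.0, burden=300.0, max_cost=40.0),
    Setting("low", eir=0.5, burden=100.0, max_cost=60.0),
]
budget = 70.0

print("proportional:", proportional_allocation(settings, budget))
print("high first:  ", sequential_allocation(settings, budget, highest_first=True))
print("low first:   ", sequential_allocation(settings, budget, highest_first=False))
```

      In this sketch the proportional shares stay fixed whatever the budget, whereas the sequential strategies move on to the next group only once the previous one has absorbed its assumed maximum.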

      1. L198-9: did low transmission settings get the majority of funding at intermediate and maximum budgets because they have the most population (I think so, based on Fig 1)?

      Yes, this is correct. We state in the results: “the optimized distribution of funding to minimize clinical burden depended on the available global budget and was driven by the setting-specific transmission intensity and the population at risk”.

      1. L206: what is ITN distribution efficiency? This is not explained. What is the 39-year period? Why this duration?

      Further explanations have been added in the results section, which were previously only detailed in the methods:

      p. 8 L 22: “To evaluate the robustness of the results, we conducted a sensitivity analysis on our assumption on ITN distribution efficiency. Results remained similar when assuming a linear relationship between ITN usage and distribution costs (Figure S10).”

      p. 17 L25: “To ensure computational feasibility, 39 years was used as it was the shortest time frame over which the effect of re-distribution of funding from countries having achieved elimination could be observed.”

      1. L218: what is "no intervention with a high budget"? is this a phrasing confusion?

      Yes, this has been changed.

      p. 9 L14: “We estimated that optimizing ITN allocation to minimize global clinical incidence could, at a high budget, avert 83% of clinical cases compared to no intervention.”

      1. L235-7: on comparing these results to previous work on the 20 highest-burden countries: is the definition of "high" similar enough across these studies that this is a relevant comparison?

      We believe this is reasonably comparable, as looking at the 20 highest-burden countries encompasses almost the entire high-transmission group in our work (25 countries in total), on which the comparison is made.

      1. L267-70: I didn't understand this sentence at all.

      Thanks for flagging this. The sentence referred to is: “Allocation proportional to disease burden did not achieve as great an impact as other strategies because the funding share assigned to settings was constant irrespective of the invested budget and its impact, and we did not reassign excess funding in high-transmission settings to other malaria interventions.”

      The previously mentioned added details on the proportional allocation strategy in the manuscript should now make this clearer, together with this clarification:

      p. 11 L17: “In modelling this strategy, we did not reassign excess funding in high-transmission settings to other malaria interventions, as would likely occur in practice.”

      For proportional allocation, a fixed proportion of the budget is calculated for each country based on disease burden, as described in the Global Fund allocation documentation (see Methods). However, since ITNs are the only intervention considered, this leads to a higher budget being allocated than is needed in some countries (i.e. where more funding doesn’t translate into further health gains).
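      The point about excess funding can be illustrated with a minimal Python sketch. It is not the authors' model or the Global Fund algorithm: the function names, the three settings, the burden shares, and the `saturation_cost` values (the spending level beyond which more ITN funding is assumed to yield no further gains) are all hypothetical and chosen only for illustration.

```python
def proportional_allocation(burden, budget):
    """Split a budget across settings in proportion to disease burden."""
    total = sum(burden.values())
    return {name: budget * b / total for name, b in burden.items()}


def excess_funding(allocation, saturation_cost):
    """Funding above the (assumed) level at which extra ITN spending stops helping."""
    return {name: max(0.0, allocation[name] - saturation_cost[name])
            for name in allocation}


# Hypothetical illustrative values, not taken from the study.
burden = {"high": 60.0, "moderate": 30.0, "low": 10.0}
saturation_cost = {"high": 40.0, "moderate": 35.0, "low": 30.0}

allocation = proportional_allocation(burden, budget=100.0)
excess = excess_funding(allocation, saturation_cost)
```

      With these assumed numbers the high-burden setting receives more than it can usefully spend on ITNs alone, while the other settings remain below their saturation level, which is the pattern the response describes.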

      1. L339 EIR range: 80 is high at the country level but areas within countries probably went as high as 500 back in 2000. How does this affect the modeled estimates of ITN impact?

      The question of sub-national differences in transmission has been addressed in the public review comments. Briefly, we consider the national scale to be most relevant for our analysis because this is the scale at which global funding allocation decisions are made in practice. Although, as you correctly point out, the EIR affects ITN impact, it is not possible to conclude what the average effect of this would be on the country level without considering the following factors and introducing further assumptions on these: how would different countries distribute funding sub-nationally? Which countries would take cooperative or competitive approaches to tackle malaria within a region or in border areas? Would there be delays in the allocation of bednets in specific regions? These interesting questions were outside of the scope of this work, but certainly require further investigation.

      1. L347 population size constant: births and deaths are still present, is that right? Unclear from this sentence

      Yes, this is correct. Full details on the model can be found in the Supplementary Materials.

      1. L370 estimating ITN distribution required to achieve simulated population usage: is this a single relationship for all of Africa? Is it based on ITNs distributed 2:1 -> % access -> % usage? So it accounts for allocation inefficiency?

      Yes, this is represented by a single relationship for all of Africa to account for allocation inefficiency and is based on observed patterns across the continent and methodology developed in a previous publication (Bertozzi-Villa et al., 2021, Nature Communications) (7). Full details can be found in the Supplementary Materials (“Relationship between distribution and usage of insecticide-treated nets (ITNs)”, p. 21).

      1. L375: the ITN unit cost is assumed constant across countries and time (I think, it doesn't say explicitly), is this a good assumption?

      Yes, this is correct. We consider this a reasonable assumption within the scope of the paper. While delivery costs likely vary across countries, international funders usually have pooled procurement mechanisms for ITNs (The Global Fund, 2023, Pooled Procurement Mechanism Reference Pricing: Insecticide-Treated Nets).

      1. L399: "single allocation of a constant ITN usage" it is not explained what exactly this means

      Further explanations have been added in the manuscript.

      p. 8 L24: “While the main analysis involves a single allocation decision to minimise long-term case burden (leading to a constant ITN usage over time in each setting irrespective of subsequent changes in burden), we additionally explored an optimal strategy with dynamic re-allocation of funding every 3 years to minimise cases in the short term.”

      Reviewer #2 (Recommendations For The Authors):

      1. Additionally to the public comments, the only major comment is that in this reviewer's opinion, the focus on ITNs as the only intervention should be made clearer at different places in the manuscript (e.g. in the discussion lines 303-304). Otherwise, there are only some minor comments (see below).

      We have now modified the following sentence and also included this suggestion in the title (“Resource allocation strategies for insecticide-treated bednets to achieve malaria eradication”).

      p. 13 L8: “Our analysis demonstrates the most impactful allocation of a global funding portfolio for ITNs to reduce global malaria cases.”

      1. Minor comments:
      2. It may be of interest to compare the maximum budget obtained from the optimization with other estimates of required funding and actual available funding.

      Thank you for this interesting suggestion. Our maximum budget estimates are similar to the required investments projected for the WHO Global Technical Strategy: US$3.7 billion for ITNs in our analysis compared to between US$6.8 and US$10.3 billion total annual resources between 2020 and 2030, of which an estimated 55% would be required for (all) vector control (US$3.7 - US$5.7 billion) (Patouillard et al., 2016, BMJ Global Health) (8). However, it is well known that current spending is far below these requirements: total investments in malaria were estimated to be about US$3.1 billion per year in the last 5 years (World Health Organization, 2022, World Malaria Report 2022) (9).

      1. Line 177: should "Figure S7" be bold?

      Yes, this has been corrected.

      1. Line 218: what does "no intervention with high budget" mean? Should this simply be "no intervention"?

      This has been changed.

      p. 9 L14: “We estimated that optimizing ITN allocation to minimize global clinical incidence could, at a high budget, avert 83% of clinical cases compared to no intervention.”

      1. In this reviewer's opinion it would be easier for the reader if the weighting term in the objective function would be added in the Materials and Methods section. The weighting could be added without extending the section substantially and the explanation in lines 390-393 may be easier to understand.

      Thank you for this suggestion. We agree and have added this in the main manuscript.

      References

      1. The Global Fund. Description of the 2020-2022 Allocation Methodology. 2019. Available from: https://www.theglobalfund.org/media/9224/fundingmodel_2020-2022allocations_methodology_en.pdf

      2. Hay SI, Guerra CA, Tatem AJ, Noor AM, Snow RW. The global distribution and population at risk of malaria: past, present, and future. Lancet Infect Dis. 2004;4(6):327-36.

      3. Feachem RGA, Phillips AA, Hwang J, Cotter C, Wielgosz B, Greenwood BM, et al. Shrinking the malaria map: progress and prospects. The Lancet. 2010;376(9752):1566-78.

      4. Sherrard-Smith E, Winskill P, Hamlet A, Ngufor C, N'Guessan R, Guelbeogo MW, et al. Optimising the deployment of vector control tools against malaria: a data-informed modelling study. The Lancet Planetary Health. 2022;6(2):e100-e9.

      5. World Health Organization. WHO Guidelines for malaria, 31 March 2022. Geneva: World Health Organization; 2022. Contract No.: Geneva WHO/UCN/GMP/2022.01 Rev.1.

      6. Griffin JT, Bhatt S, Sinka ME, Gething PW, Lynch M, Patouillard E, et al. Potential for reduction of burden and local elimination of malaria by reducing Plasmodium falciparum malaria transmission: a mathematical modelling study. The Lancet Infectious Diseases. 2016;16(4):465-72.

      7. Bertozzi-Villa A, Bever CA, Koenker H, Weiss DJ, Vargas-Ruiz C, Nandi AK, et al. Maps and metrics of insecticide-treated net access, use, and nets-per-capita in Africa from 2000-2020. Nature Communications. 2021;12(1):3589.

      8. Patouillard E, Griffin J, Bhatt S, Ghani A, Cibulskis R. Global investment targets for malaria control and elimination between 2016 and 2030. BMJ Global Health. 2017;2(2):e000176.

      9. World Health Organization. World malaria report 2022. Geneva: World Health Organization; 2022. Report No.: 9240064893.

    1. Author Response:

      We thank all of you for your constructive and inspiring comments, which will help us substantially improve the final version of the paper. Ahead of our detailed final revision, I am writing this provisional letter to respond briefly to the reviewers’ comments.

      I first give a brief summary of the public reviews, then respond point by point.

      Editors:

      1. More discussion is needed.

      2. More discussion about eye fixation during adaptation. Discuss why increasing visual uncertainty by blurring the cursor in the present study produces the opposite findings of previous studies (Tsay et al., 2021; Makino et al., 2023).

      3. Discuss the broad impact of the current model.

      4. Share the codes and the metadata (instead of the current data format).

      Response: This is a concise summary of the major concerns listed in the public review. Given that these concerns are easy to address, we give a quick point-by-point response for now. A more detailed version will be included in our formal revision.

      Reviewer 1:

      1) More credit should be given to the PReMo model: a) The PReMo model also proposes that perceptual error drives implicit adaptation, as described in a new publication (Tsay et al., 2023) that was not public at the time of the current writing; and b) The PReMo model can account for some datasets, e.g. Fig 4A.

      Response: We will add this new citation and point out that the new paper also uses the term perceptual error. We will also point out that the PReMo model has the potential to explain Fig 4A, though for now, it assumes an additional visual shift to explain the positive proprioceptive changes relative to the target. We will expand the discussion comparing the two models.

      2) The present study produced an opposite finding of a previous finding, i.e., upregulating visual uncertainty (by cursor blurring here) decreases adaptation for large perturbations but less so for small perturbations, while previous studies have shown the opposite (by using a cursor cloud; Tsay et al., 2021; Makino et al., 2023). This needs explanation.

      Response: Using the cursor cloud (Tsay et al., 2021, Makino et al., 2023) to modulate visual uncertainty has inherent drawbacks that make it unsuitable for testing the sensory uncertainty effect for visuomotor rotation. For the error clamp paradigm, the error is defined as angular deviation. The cursor cloud consists of multiple cursors spanning over a range of angles, which affects both the sensory uncertainty (the intended outcome) AND the sensory estimate of angles (the error itself, the undesired outcome). In Bayesian terms, the cursor cloud aims to modulate the sigma of a distribution (sigma_v in our model), but it additionally affects the mean of the distribution (mu). This unnecessary confound is avoided by using cursor blurring, which is still a cursor with its center (mu) unchanged from an un-blurred cursor. Furthermore, as correctly pointed out in the original paper by Tsay et al., 2021, the cursor cloud often overlaps with the visual target. This “target hit” would affect adaptation, possibly via a reward learning mechanism (See Kim et al., 2019 eLife). This is a second confound that accompanies the cursor cloud. We will expand our discussion to explain the discrepancy between our findings and previous findings.
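      The sigma-versus-mean argument above can be illustrated with a minimal Python sketch of textbook Gaussian cue combination. It is not the authors' model: the function name `combine_cues`, the second (non-visual) cue, and all numbers are assumptions chosen for illustration. Each cue is weighted by its inverse variance, so raising the visual sigma while leaving the visual mean untouched, as blurring is intended to do, only lowers the weight of the visual cue.

```python
import math


def combine_cues(means, sigmas):
    """Inverse-variance weighted combination of independent Gaussian cues."""
    precisions = [1.0 / (s * s) for s in sigmas]
    total = sum(precisions)
    mean = sum(m * p for m, p in zip(means, precisions)) / total
    return mean, math.sqrt(1.0 / total)


# Hypothetical values in degrees: a visual cue at 30 and a second cue at 0.
sharp_mean, sharp_sigma = combine_cues([30.0, 0.0], [2.0, 2.0])

# Blurring modelled as a larger visual sigma with the visual mean unchanged.
blurred_mean, blurred_sigma = combine_cues([30.0, 0.0], [6.0, 2.0])
```

      With these assumed values the combined estimate moves from about 15 degrees to about 3 degrees as the visual sigma grows. A manipulation that also shifted the perceived visual mean would change the first argument as well, which is the confound described above.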

      3) The estimation of visual uncertainty (our exp1) required people to fixate on the target, while this might not reflect the actual scenario during adaptation where people are free to look wherever they want.

      Response: Our data show otherwise: in a typical error-clamp setting, people fixate on the target for the majority of the time. For our Exp1, the fixation on the straight line between the starting position and the target is 86%-95% (as shown in Figure S1). We also collected eye-tracking data in our Exp4, which is a typical error-clamp experiment. More than 95% of gaze falls within ±50 pixels of the center of the screen, slightly higher than in Exp1. We will provide this part of the data in the revision. In fact, we designed Exp1, through carefully executed pilot experiments, to mimic the gaze pattern seen in typical error-clamp learning.

      This high percentage of fixating on the target is not surprising: the error-clamp task requires participants to use their hands to move towards the target and to ignore the cursor. In fact, we would also like to point out that the high percentage of fixation on the aiming target is also true for conventional visuomotor rotation, which involves strategic re-aiming (shown in de Brouwer et al. 2018; Bromberg et al. 2019; we have an upcoming paper to show this). This is one reason that our new theory would also apply to other types of motor adaptation.

      4) More methodology details are needed. E.g., a figure showing the visual blurring, a figure showing individual data, a table showing data from individual sessions, code sharing, and a possible new correlational analysis.

      Response: All of this additional methodological and analysis information will be provided. We had limited ourselves to writing a short paper, but the revision will be extended to include these details.

      Reviewer 2:

      1) More discussions are needed since the focus of this study is narrowly confined to visuomotor rotation. “A general computational principle, and its contributions to other motor learning paradigms remain to be explored”.

      Response: This is a great suggestion, since we also think our original Discussion did not elaborate on the possible broad impact of our theory. Our model is not limited to error-clamp adaptation, where the participants were explicitly told to ignore the rotated cursor. The error-clamp paradigm is a rare example in which implicit motor learning can be isolated in a nearly ideal way. Our findings thus imply two key aspects of implicit adaptation: 1) localizing one’s effector is implicitly processed and continuously used to update the motor plan; 2) Bayesian cue combination is at the core of integrating multimodal feedback and motor-related cues (motor prediction cue in our model) when forming procedural knowledge for action control.

      We will propose that the same two principles should be applied to various kinds of motor adaptation and motor skill learning, which constitutes motor learning in general. Most of our knowledge about motor adaptation is from visuomotor rotation, prism adaptation, force field adaptation, and saccadic adaptation. The first three types all involve localizing one’s effector under the influence of perturbed sensory feedback, and they also have implicit learning. We believe they can be modeled by variants of our model, or at least we should consider using the two principles above to think of their computational nature. For skill learning, especially for de novo learning, the area still lacks a fundamental computational model that accounts for the skill acquisition process on the level of relevant movement cues. Our model suggests a promising route, i.e., repetitive movements with a Bayesian cue combination of movement-related cues might underlie the implicit process of motor skills.

      We will add more discussion on the possible broad implications of our model in the revision.

      Reviewer 3:

      1) Similar to Reviewer 1, raised the concern about whether people’s fixation in typical motor adaptation settings is similar to the fixation that we instructed in our Exp1.

      Response: see above.

      2) Similar to Reviewer 2, the concern was raised about whether our new theory is applicable to a broad context. Especially, error clamp appears to be a strange experimental manipulation that has no real-life appeal, “(i)Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world”.

      Response: about the broad impact of our model, please see responses to Reviewer 2 above. We agree that ignoring errors (and thus “trying” to suppress adaptation) should not be a movement strategy for real-world intentional tasks. However, even in real life, we constantly attend to one thing and do the other thing; that’s when implicit motor processes are in charge. Furthermore, it is this exact “ignoring” instruction that elicits the implicit adaptation that we can work on. In this sense, the error-clamp paradigm is a great vehicle to isolate implicit adaptation and allows us to unpack its cognitive mechanism.

      3) In Exp1, the 1s delay between the movement end and the presentation of the reference cursor might inflate the actual visual uncertainty.

      Response: The 1s delay of the reference cursor would not inflate the estimate of visual uncertainty. Our Exp1 used a paradigm similar to one from vision science (e.g., White, Levi, and Aitsebaomo, Vision Research, 1992), where delay was shown not to produce an obvious increase in visual uncertainty over a broad range of values (from 0.2s to >1s, see their Figure 5-6). We will add more methodological justification in our revision.

      4) Our Fig4A used Tsay et al., 2021 data, which, in the reviewer’s view, is not an appropriate measure of proprioceptive bias. The reason is that in this dataset, “participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to.”

      Response: We agree that the Tsay et al., 2021 study used an unconventional way to measure the influence of implicit adaptation on proprioception. Their observed “proprioceptive changes” should not be called “proprioceptive bias”, which is conventionally reserved for the difference between the estimated and the actual hand location (preferably of a passively moved hand). However, we think their dataset is still subject to the same Bayesian cue combination principle and thus can be modeled. Our modeling of this dataset includes all relevant cues: the implicitly perceived hand position and the proprioceptive cue (given that the hand stays at the movement end). Both cues are in extrinsic coordinates, which happen to set the target position as zero. But where to set the zero (whether it is the target or the actual hand location) does not matter for the model fitting. Note that our Exp4 is also based on PEA modeling of proprioceptive bias, and this time the data are presented relative to the actual hand location.

      In the revision, we will keep the current Fig 4A and refer to the data as proprioceptive change rather than proprioceptive bias, following convention.

    1. eLife assessment

      This important study expands our understanding of the role of two axon guidance factors in a specific axon guidance decision. The strength of the study is the convincing axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The current manuscript provides an extensive in vivo analysis of two guidance pathways identifying multiple mechanisms that shape the bifurcation of DRG axons when forming the dorsal funiculus in the DREZ.

      Strengths:<br /> Multiple mouse mutant lines were used, together with complementary techniques; the results are very clear and compelling.<br /> The findings are very significant and clearly move forward our understanding of the regulation of axonal development at the DREZ.

      Weaknesses:<br /> No major weaknesses were found. As it is I have no recommendations that would increase the clarity or quality of the manuscript.

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, the authors conduct a detailed analysis of the molecular cues that control the guidance of bifurcated dorsal root ganglion axons in a key region of the spinal cord called the dorsal funiculus. This is a specific case of axon guidance that occurs in a precise way. The authors knew that Slit was important but many axons still target correctly in Slit knockouts, suggesting a role for other guidance factors. Netrin1 is also expressed in this region, so they looked at netrin mutants. The authors found axons outside the DREZ in the Ntn1 mutants, and they show by single-neuron genetic labeling that many of these come from DRG neurons. Quantified axonal tracing studies in Slit1/2, Ntn1, or triple mutant embryos support the idea that Slit and Ntn1 have distinct functions in guidance and that the effect of their loss is additive. Interestingly, none of these knockouts affect bifurcation itself but rather the guidance of one or both of the bifurcated axon terminals. Knockout of the Slit receptors (Robo1/2) or the Netrin 1 receptor (DCC) in embryos causes similar guidance defects to loss of the ligands, providing additional confirmation of the requirement for both guidance pathways.

      Strengths:<br /> This study expands understanding of the role of the axon guidance factors Ntn1/DCC and Slit/Robo in a specific axon guidance decision. The strength of the study is the careful axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

      Weaknesses:<br /> There are some places in the text where the discussion of these data is compared with other studies and models, but additional details would help clarify the arguments.

    4. Reviewer #3 (Public Review):

      Summary:<br /> In this paper, Curran et al. investigate the role of Ntn, Slit1, and Slit2 in the axon patterning of DRG neurons. The paper uses mouse genetics to perturb each guidance molecule and its corresponding receptor. Cre-based approaches and immunostaining of DRG neurons are used to assess the phenotypes. Overall, the study uses the strength of mouse genetics and imaging to reveal new genetic modifiers of DRG axons. The conclusions of the experiments match the presented results. The paper is an important contribution to the field, as it provides evidence that dorsal funiculus formation is impacted by Ntn and Slit signaling. However, there are some areas of the manuscript that should be edited to better match the results with the conclusions of the work.

      Strengths:<br /> The manuscript uses the advantage of mouse genetics to investigate the axon patterning of DRG neurons. The work does a great job of assessing individual phenotypes in single and double mutants. This reveals an intriguing cooperative and independent function of Ntn, Slit1, and Slit2 in DRG axon patterning. The sophisticated triple mutant analysis is lauded and provides important insight.

      Weaknesses:<br /> Overall, the manuscript is sound in technique and analysis. However, the majority of the manuscript is about the dorsal funiculus and not the bifurcation of the axons, as the title would lead a reader to believe. Further, the manuscript would benefit from a more scholarly discussion of the current knowledge of DRG axon patterning and how this work fits into that knowledge.

    1. eLife assessment

      This study delves into the complex role of STAT3 signaling and its interplay with TGF-beta and SMAD4 in KRAS mutant pancreatic cancer. The authors demonstrate that both the presence and absence of STAT3, relative to SMAD4, can lead to poor PDAC differentiation and that STAT3 mutations affect p53-null fibroblasts with KRASG12D and induce an EMT-like phenotype. By providing convincing evidence, the authors were able to derive important insights into KRAS mutant cancers.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This study presents a valuable finding on the increased activity of two well-studied signal transduction pathways, STAT3 and TGF-beta, in a specific subtype of pancreatic cancer. Specifically, SMAD4-deficient tumors (commonly observed in pancreatic cancer) are well differentiated in the presence of STAT3. Yet surprisingly, in the presence of SMAD4 in a STAT3-deficient pancreatic cancer, the phenotype is poorly differentiated in the background of KRASG12D. The evidence in the animal models supporting the authors' claims is solid, although including TCGA data and/or a larger number of patients would have strengthened the study. The work will be of interest to medical biologists working on pancreatic cancer and potentially the broader field.

      Strengths:<br /> Strengths are the animal models and the lead author's expertise in STAT3 signaling.

      Weaknesses:<br /> Weaknesses are the absence of correlation between the results from the animal studies and human pancreatic cancers.

    3. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript explores mechanisms by which STAT3 may regulate KRAS mutant cancers.

      In the first set of experiments, STAT3 GOF mutants diminished the transformation of p53-null mouse embryonic fibroblasts expressing endogenous mutant KRAS(G12D) (KP MEFs) and this was dependent on direct transcriptional activation induced by phosphorylated STAT3. It appears that this is mediated via a reduction in TGFb signaling such that knockout of either TGFBR2 or SMAD4 can phenocopy the effects of STAT3 GOF mutants in KP MEFs.

      In the next part of the paper, the authors used murine pancreatic ductal adenocarcinoma (PDAC)-derived cell lines bearing endogenous KRAS(G12D) and TP53(R172H) mutations (KPC) to determine the extent to which STAT3 may regulate KRAS dependency. They determined that KRAS and STAT3 KO both induced mesenchymal-like phenotypes and that TGFBR2 and SMAD4 KO induced epithelial phenotypes. The loss of STAT3 appeared to correlate with a KRAS-independent signature, and SMAD4/TGFBR2 KO could not induce epithelial phenotypes when STAT3 was also knocked out.

      Strengths:<br /> Overall, this is an interesting paper that highlights the complicated interactions between KRAS, STAT3, and TGF beta signaling. The authors use multiple models and attempt to link data to patient cohorts.

      Weaknesses:<br /> While correlations are strong, the study would benefit from additional cause-and-effect type experiments. It would also be beneficial to better tie together the first and second parts of the paper.

    1. eLife assessment

      This manuscript presents a detailed characterization of male and female wild-type and CTRP10 knockout mice, revealing that knockout mice develop female-specific obesity that is largely uncoupled from metabolic dysfunction. The data are convincing, and the work is a valuable contribution to understanding how obesity is coupled to metabolic dysfunction, and how this can occur in a sex-specific manner.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The manuscript by Chen et al. presents a detailed metabolic characterization of male and female WT and CTRP10 knockout mice. The main finding is that female KO mice become obese on both low-fat and high-fat diets but without evidence of marked insulin resistance, hepatic steatosis, dyslipidemia, or increased inflammatory markers. The authors performed a detailed transcriptomic analysis and identified differentially expressed genes that distinguish high-fat diet-fed CTRP10 KO from WT control mice. They further show that this set of genes exhibits cross-correlation in human tissues, and that this is greater in females than in males. The data indicate that the CTRP10 KO model may be useful to understand how obesity and metabolic dysfunction are coupled to each other, and how this occurs by a sex-biased mechanism.

      Strengths:<br /> The work presents a large amount of data, which has been carefully acquired and is convincing. The transcriptomic analysis will further help to define what pathways are associated with obesity, but not necessarily with metabolic dysfunction. The manuscript will be of interest to investigators studying metabolic diseases, and to those studying sex-specific differences in metabolic physiology. The limitations of the study are acknowledged, including that a whole-body knockout was used. The cause of the increased body weight is not entirely clear, despite the careful and detailed analysis that was performed. Notwithstanding these limitations, the phenotype is interesting, and this work will establish a basis for further work to understand the mechanisms that are involved.

      Weaknesses:<br /> Genes identified as DEGs in the mouse RNAseq data set were used to identify a set of human orthologous transcripts and the abundances of these transcripts were correlated with each other in Figure 10. This identified a greater correlation ("connectivity") in subQ adipose compared to other tissues, and in females compared to males. The description of how this analysis was done could be clearer. In some cases, the text refers to the software that was used without describing the goal of the analysis. In other instances, specialized terminology was used (e.g. "biweight midcorrelation") without defining what this means.

    3. Reviewer #2 (Public Review):

      Summary:<br /> In the current study, the authors investigated how loss of CTRP10 results in female obesity with preserved metabolic health. The overall conclusion, that CTRP10 negatively regulates body weight in females and that loss of CTRP10 results in benign obesity with largely preserved insulin sensitivity and metabolic health, is supported by the experimental data. The authors have shown the role of sex differences in the metabolically healthy obese (MHO) phenotype, which may increase the scope for research in this area.

      Strengths:<br /> The study provides a detailed idea of how genes are regulated in a sex-dependent manner.

      Weaknesses:<br /> Mechanistic details are missing.

    4. Reviewer #3 (Public Review):

      Summary:<br /> This study examines the impact of CTRP10/C1QL2 absence on obesity and metabolic health in mice. Female mice lacking CTRP10 tend to develop obesity, particularly on a high-fat diet. Surprisingly, they do not display the typical metabolic traits associated with obesity, like fatty liver or glucose intolerance. This indicates a disconnection between weight gain and metabolic issues in these female mice. The research underscores the need to understand sex-specific factors in how obesity influences metabolic health.

      Strengths:<br /> The study provides compelling evidence regarding Ctrp10's role in female-specific metabolic regulation in mice, shedding light on its potential significance in metabolically healthy obese (MHO) individuals.

      Weaknesses:<br /> -The analysis and description of sex-specific human data require more details to highlight the relevance of Ctrp10 mouse data and the analysis of differentially expressed genes in humans.<br /> -There's a lack of analysis regarding secreted Ctrp10 under various dietary conditions.<br /> -The study didn't assess adipose tissue function to evaluate metabolic health.

    1. eLife assessment

      This study presents an important method and resource in cell lines and in mice for mass spectrometry-based identification of interactors of the proteasome, a multi-protein complex with a central role in protein turnover in almost all tissues and cell types. The method presented, including the experimental workflow and analysis pipeline, as well as the several lines of validation provided throughout, is convincing. Given the growing interest in protein aggregation and targeted protein degradation modalities, this work will be of interest to a broad spectrum of basic cell biologists and translational researchers.

    2. Reviewer #1 (Public Review):

      Summary:

      Bartolome et al. report adaptation of proximity labeling using BirA and TurboID fusions to proteasome subunits to identify the proteasome-proximal proteome both in cultured cells and also in a newly developed mouse model. Using this approach, the authors demonstrate identification of many known proteasome-interacting proteins, as well as several new proteins, some of which are validated directly. The authors further evaluate the proteasome-proximal proteome in most mouse organs, and find substantial agreement with the proteome identified from cultured cells, as well as between tissues. This represents one of the first studies of the "proteasome-ome" in vivo, and sets the stage for addressing numerous important future questions regarding how the proteasome's environment changes over time, in response to different stimuli, and in distinct disease conditions.

      Strengths:

      Generally speaking, the approach provided is rigorous and supported by several complementary lines of evidence, such as demonstration that the interactome is enriched for known proteasome-binding proteins and co-purification or co-elution experiments. Similarly, the high agreement between the outcomes in cultured cells and in the mouse model developed by the authors provides further confidence in the results.

      Weaknesses:

      The major weakness of the work is arguably the choice of proteasome subunits for tagging with biotinylating enzymes. In most cases, the subunits and termini chosen for tagging are known to either protrude toward functionally important regions (such as the substrate-processing pore of the ATPase component), to have important functional roles likely to be disrupted via tagging, or are subunits known to be substituted by others in some conditions. Thus, the interactome reported may conflate those of normal proteasomes with those harboring tag-induced functional or structural defects. Although the authors made a commendable attempt to demonstrate minimal impacts of tagging, the conclusions would be greatly strengthened by contrasting the impacts of tagging subunits less likely to cause perturbations and by more rigorously demonstrating normal proteolysis of a broader array of known proteasome substrates.

    3. Reviewer #2 (Public Review):

      Summary

      In this work, Bartolome and colleagues develop a new approach to identify proteasome interacting proteins and substrates. The approach is based on fusing proteasome subunits with a biotin ligase that will label proteins that come in close physical distance of the ligase. These biotin-labeled proteins (or their resulting tryptic peptides) can be affinity purified using streptavidin and identified by mass spectrometry.

      This elegant solution was able to identify a large proportion of known proteasome interactors, as well as multiple potential new interactors. Combining this approach with a proteasome inhibitor also allowed for the enrichment of substrates, due to increased contact time between substrates and the proteasome. Again, the authors were able to identify novel substrates. Finally, the authors implemented this strategy in vivo, providing hints of potential tissue-specific proteasome interactors.

      This novel strategy provides an additional approach to identify new proteasome substrates, which can be particularly powerful for low abundant proteins, e.g., transcription factors. The possibility to implement it in vivo in specific cell types opens the possibility for identifying proteasome interactors in small cell subpopulations or in subpopulations involved in disease.

      Strengths:

      The authors carefully characterized their genetically engineered proteasome-biotin ligase fusions to ensure that proteasome structure and activity were not altered. This is key to ensure that the proteins identified to interact with the proteasome reflect interactions that occur under physiological conditions.

      The authors implemented an algorithm that controls the false positive rate of the identified interactors of the proteasome. This is an important aspect to avoid spending time on the characterization of potential interactors that are just an artifact of the experimental setup.

      The addition of a proteasome inhibitor allowed the authors to identify substrates of the proteasome. Although there are other strategies to do this (e.g., affinity purification of Gly-Gly modified peptides, which is a marker for ubiquitination), this additional approach can highlight currently unknown substrates. One example is low-abundance proteins, such as transcription factors.

      The overall strategy developed by the authors can be implemented in vivo, which opens the possibility of determining cell type-specific proteasome interactors (and perhaps substrates).

      Weaknesses:

      There is a small proportion of the PSMA4-biotin ligase fusion that remains unassembled (i.e., not part of the functional proteasome) and that can contribute to a small proportion of false positive interactions.

    4. Reviewer #3 (Public Review):

      Summary:

      Bartolome et al. present ProteasomeID, a novel method to identify components, interactors, and (potentially) substrates of the proteasome in cell lines and mouse models. As a major protein degradation machine that is highly conserved across eukaryotes, the proteasome has historically been assumed to be relatively homogeneous across biological scales (with few notable exceptions, e.g., immunoproteasomes and thymoproteasomes). However, a growing body of evidence suggests that there is some degree of heterogeneity in the composition of proteasomes across cell tissues, and that this composition can be highly dynamic in response to physiologic and pathologic stimuli. This work provides a methodological framework for investigating such sources of variation. The authors start by adapting the increasingly popular biotin ligation strategy for labelling proteins coming into close proximity with one of three different subunits of the proteasome, before proceeding with PSMA4 for further development and analysis based on their preliminary labelling data. In a series of well-constructed and convincing validation experiments, the authors go on to show that the tagged PSMA4 construct can be incorporated into functional proteasomes, and is able to label a broad set of known proteasome components and interacting proteins in HEK293T cells. They also attempt to identify novel proteasomal degradation substrates with ProteasomeID; while this was convincing for known substrates with particularly short half-lives (exemplified by the transcription factor c-myc), follow-up validation experiments with other substrates were less clear. One of the most compelling results was from a similar experiment to confirm proteasomal degradation induced by a BRD-targeting PROTAC, which I think is likely to be of keen interest to the targeted degradation community. Finally, the authors establish a ProteasomeID mouse model, and demonstrate its utility across several tissues.

      Strengths:

      1) ProteasomeID itself is an important step forward for researchers with an interest in protein turnover across biological scales (e.g., in sub-cellular compartments, in cells, in tissues, and whole organisms). I especially see interest from two communities: those studying fundamental proteostasis in physiological and pathologic processes (e.g., ageing; tissue-specific protein aggregation diseases), and those developing targeted protein degradation modalities (e.g., PROTACs; molecular glues). All the datasets generated and deposited here are likely to provide a rich resource to both. The HEK293T cell line data are a valuable proof-of-concept to allow expansion into more biologically-relevant cell culture settings; however, I envision the greatest innovation here to be the mouse model. For example, in the targeted protein degradation space, two major hurdles in early-stage pre-clinical development are (i) evaluation of degradation efficacy across disease-relevant tissues, and (ii) toxicity and safety implications caused by off-target degradation, e.g., of newly-identified molecular glues and/or in particularly-sensitive tissues. The ProteasomeID mouse allows early in vivo assessment of both these questions. The results of the BRD PROTAC experiment in 293T cells provide an excellent in vitro proof-of-concept for this approach.

      2) The mass spectrometry-based proteomics workflows used and presented throughout the manuscript are robust, rigorous, and convincing. For example, the algorithm the authors use for defining enrichment score cut-offs is logical and based on rational models, rather than on arbitrary cut-offs that are common for similar proteomics studies. The construction (and subsequent validation) of both BirA*- and miniTurbo-tagged PSMA4 variants also increases the utility of the method, allowing researchers to choose the variant with the labelling time-scale required for their particular research question.

      3) The optimised BioID and TurboID protocol the authors develop (summarised in Fig. S2A) and validate (Fig. S2B-D) is likely to be of broad interest to cell and molecular biologists beyond the protein degradation field, given that proximity labelling is a current gold-standard in global protein:protein interaction profiling.

      Limitations:

      I think the authors do an excellent job in highlighting the limitations of ProteasomeID throughout the Results and Discussion. I do have some specific comments that might provide additional context for the reader.

      1) The authors do a good job in showing that a substantial proportion of PSMA4-BirA* is incorporated into functional proteasome particles; however, it is not immediately clear to me how much background (false-positive IDs) might be contributed by the ~40 % of PSMA4-BirA* that is not incorporated into the mature core particle (based on the BirA* SEC-MS traces in Fig. 2b and S3b, i.e., the large peak ~ fraction 20). Are there any bands lower down in the native gel shown in Fig. 2c, i.e., corresponding to lower molecular weight complexes or monomeric PSMA4-BirA*? The enrichment of proteasome assembly factors in all the ProteasomeID experiments might suggest the presence of assembly intermediates, which might themselves become substrates for proteasomal degradation (as has been shown for other incompletely-assembled protein complexes, e.g., the ribosome, TRiC/CCT).

      2) Although the authors attempt to show that BirA* tagging of PSMA4 does not interfere with proteasome activity (Fig. 2e-f), I think the experimental evidence for this is incomplete. They show that the overall chymotrypsin-like activity (attributable to PSMB5) in cells expressing PSMA4-BirA* is not markedly reduced compared with control BirA*-expressing cells. However, they do not show that the activity of the specific proteasome sub-population that contains PSMA4-BirA* is unaffected (e.g., by purifying this sub-population via the Flag tag). The proteasome activity of the sub-population of wild-type proteasome complexes that do not contain the PSMA4-BirA* (~50%, based on the earlier immunoblots) could account for the entire chymotrypsin-like activity, especially in the context of HEK293T cells, where steady-state proteasome levels are unlikely to be limiting. It would also be useful to assess any changes in trypsin- and caspase-like activities, especially as tagging of PSMA4 could conceivably interfere with the activity of some PSMB subunits, but not others.

      3) I was left unsure of the general utility of ProteasomeID for identifying novel proteasomal substrates in homeostatic or stressed conditions. The immunoblots for the two candidates the authors follow up in Fig. 4g were not especially clear; the reduction in the bands is modest, at best. Furthermore, classifying candidates based on enrichment following proteasome inhibition with MG-132 has the potential to lead to a high number of false positives. ProteasomeID's utility in identifying potential substrates in more targeted settings (e.g., molecular glues, off-target PROTAC substrates) is far more apparent.

    1. eLife assessment

      This useful study investigates the role of the centrosomal protein CEP44 in centriole duplication and mitotic spindle formation. While the analysis of CEP44 mitotic phosphorylation and spindle recruitment is solid, the characterization of CEP44's role at centrioles is incomplete and would benefit from additional controls and analyses. Since the work links CEP44 reduced expression to poor survival in breast cancer patients, it is of interest not only to cell biologists but also to cancer researchers.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Zhang et al. describe novel roles for the centriolar protein CEP44, namely that it is required for centriole engagement (and thus inhibition of centriole reduplication) and that it promotes microtubule stability. While a function of CEP44 in centriole engagement is shown somewhat convincingly, the data do not support a role for CEP44 in microtubule stabilization.

      Strengths:<br /> The finding that centriole engagement relies on CEP44 is novel and of great interest to the centriole field. Interestingly, the authors correlate reduced CEP44 expression levels with the occurrence of breast carcinoma, which makes this study also very interesting for a broad audience.

      Weaknesses:<br /> The paper has important findings, but unfortunately, the main claims are only partially supported.

      1) The role of CEP44 in microtubule stability is not clear from the presented data:<br /> - Fig. 7A and S6 A, there is no visible difference in microtubule density/intensity between the different groups of cells. In Fig. 7C, the CEP44 S324A spindle looks even brighter than the WT spindle. The authors need to indicate how many cells were analyzed. This information is actually lacking in all the experiments.

      2) Several figure parts are not properly labelled.

      3) Several of the experiments (WBs) likely miss proper controls: How did the authors detect proteins that run at very similar sizes: 55 kDa (alpha-tubulin), 44 kDa (Cep44), and 57 kDa (Cep57 and Cep57L)? The loading control needs to be detected in the same lane as the protein of interest. Did the authors strip and reprobe membranes? If so, this needs to be indicated and included in the methods section.

      4) It is not clear how such a low CEP44-FLAG expression (Fig. 5A) can rescue a CEP44 KO.

    3. Reviewer #2 (Public Review):

      Zhang and Wei, et al. investigated the role of a centrosomal protein, CEP44, in regulating centrosomes and spindle integrity, with a focus on processes that may be dysregulated in breast cancer. The authors found that a breast cancer cell line, MDA-MB-436, lacks CEP44 protein and has amplified centrioles. CEP44 expression is reduced in samples from breast cancer patients. By super-resolution microscopy, the authors localize CEP44 to the proximal inner lumen of centrioles, as has also been previously shown by another group (Atorino et al 2020). Next, the authors investigate the role of CEP44 in centrosome regulation. They found that loss of CEP44 in HeLa cells results in extra puncta of CEP97 or Centrin-3, while ectopic overexpression of CEP44 in MDA-MB-436 cells reduces the number of CEP97 foci. Only one of the excess puncta in a CEP44-depleted HeLa cell recruits CEP164 or ODF2, indicating that extra foci were not the result of cytokinesis failure. In G1, most (~80%) of CEP44-depleted cells have 2 centrin foci, while in G2, a small population (~20%) have more than 4 centrin foci, and gamma-tubulin is recruited in foci in G2. The authors were able to observe centriole disengagement and amplification using live cell imaging. The authors propose that CEP44 acts in regulating centriole engagement by recruiting CEP57 and CEP57L1 to centrioles. The authors made CEP44 knockout cell lines using CRISPR and found that loss of CEP44 results in multipolar spindles, correlated with an increase in centriole amplification. Finally, the authors investigate the role of CEP44 at the mitotic spindle. The authors find that CEP44 localizes to spindles and is phosphorylated by Aurora A at G2/M on Ser324. Phosphorylation of CEP44 is required for its proper distribution between centrosomes and the spindle and microtubule stability within both spindles and interphase microtubules. 
Together, these studies shed light on the roles of CEP44 within centrosomes and spindles and will be of interest to cell biologists and cancer biologists studying cell division and centrosomes.

      The conclusions of this paper are only partially supported. The analyses could be improved to address the concerns about the major conclusions.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The manuscript by Zhang et al. analyzes the function of the centrosomal protein CEP44 in centriole duplication and in the formation of the mitotic spindle. The first part addresses the role of CEP44 at centrioles. Using mostly RNAi-mediated depletion in cell lines and in some cases KO cells, the authors find increased centriole numbers in depleted cells and, based on quantification of centrioles stained with various centriole markers as well as live imaging, conclude that this is due to premature centriole disengagement and overduplication. The second part, which is largely independent of the first, focuses on the role of CEP44 in the mitotic spindle. The authors find that CEP44 is phosphorylated in mitosis in an Aurora A-dependent manner and identify the phosphorylation site, which controls CEP44 spindle localization and functions in maintaining spindle integrity.

      Strength:<br /> The manuscript makes the interesting observation that reduced expression of CEP44 is observed in breast cancer and correlated with poor survival in patients.<br /> The analysis of mitotic phosphorylation including the identification of the modified site and its role in spindle recruitment is interesting and useful.

      Weakness:<br /> The authors seem to largely ignore previously published work that contrasts with the findings presented in the current study. The previous work found a role of CEP44 in centriole formation and centrosome conversion and observed reduced centriole numbers in depleted cells, whereas the current study claims the opposite, a role in centriole engagement that leads to overduplication and increased centriole number in depleted cells. However, the supporting evidence is not strong enough, especially in light of the previous work. Considering that CEP44 depletion also disrupts mitosis, which could affect centriole numbers by failed segregation/division, a more careful analysis in synchronized cultures would be needed. Also, cell cycle analysis would be required to rule out cell cycle effects in CEP44-depleted cells, which could also explain altered centriole numbers. Moreover, the quality of the imaging is often not sufficient to support the claims.<br /> The second part is largely disconnected from the first and reads as if it was a separate study. There is no attempt to integrate both parts. For example, the second part seems to largely focus on normal bipolar spindles, even though the first part reveals multipolarity as a phenotype after CEP44 knockdown. It remains unclear if the spindle defects are due to centriole defects, defective spindle microtubule stability/organization, or both, and whether the centriole-localized or spindle-localized CEP44 is involved.

      Another weak aspect is that the authors do not show, for either RNAi or KO cells, whether CEP44 is depleted at centrioles and to what extent. This is only shown in cell extract.

    1. eLife assessment

      This is an important study on the damage-induced checkpoint maintenance and termination in budding yeast that provides convincing evidence for a role of the spindle assembly checkpoint and mitotic exit network in halting the cell cycle after prolonged arrest in response to irreparable DNA double strand breaks (DSBs). The study identifies particular components from both checkpoints that are specifically required for the establishment and/or the maintenance of a cell cycle block triggered by such DSBs. The authors propose an interesting model for how these different checkpoints intersect and crosstalk for timely resumption of cell cycling even without repairing DNA damage with theoretical and practical implications.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In their manuscript, Zhou et al. analyze the factors controlling the activation and maintenance of a sustained cell cycle block in response to persistent DNA DSBs. By conditionally depleting components of the DDC using auxin-inducible degrons, the authors verified that some DDC proteins are only required for the activation (e.g., Dun1) or the maintenance (e.g., Chk1) of the DSB-dependent cell cycle arrest, while others such as Ddc2, Rad24, Rad9 or Rad53 are required for both processes. Notably, they further demonstrate that after a prolonged arrest (>24 h) in a strain carrying two DSBs, the DDC becomes dispensable and the mitotic block is then maintained by SAC proteins such as Mad1, Mad2, or the mitotic exit network (MEN) component Bub2.

      Strengths:<br /> The manuscript dissects the specific role that different components of the DDC and the SAC have during the induction of a cell cycle arrest induced by DNA damage, as well as their contribution to the short-term and long-term maintenance of a DNA DSB-induced mitotic block. Overall, the experiments are well described and properly executed, and the data in the manuscript are clearly presented. The conclusions drawn are also generally well supported by the experimental data. The observations contribute to drawing a clearer picture of the relative contribution of these factors to the maintenance of genome stability in cells exposed to permanent DNA damage.

      Weaknesses:<br /> The main weakness of the study is that it is fundamentally based only on the use of the auxin-inducible degron (AID) strategy to deplete proteins. This is a widely used method that allows a very efficient depletion of proteins. However, the drawback is that a tag is added to the protein, which can affect the functionality of the targeted protein or modify its capacity to interact with others. In fact, three of the proteins that are depleted using the AID system are shown to be clearly hypomorphic. Verification of at least some of the results using an alternative method of eliminating the proteins would help to strengthen the conclusions of the manuscript.

    3. Reviewer #2 (Public Review):

      Summary:<br /> The manuscript analyzes the genetic requirement for DNA damage-induced cell cycle checkpoint induction and maintenance in budding yeast bearing one or two unrepairable DNA double-strand breaks using auxin-induced degradation (AID) of key DNA damage response (DDR) factors. The study paid particular attention to solving a puzzle regarding how yeast bearing two unrepaired DNA breaks fail to engage in "adaptation" whereas those with a single unrepairable break eventually resume cell cycling after a prolonged (up to 12 h) G2 arrest.

      The most novel findings are: 1. The genetic requirement for the entry to DDC and the maintenance are separable. For instance, Dun1 is partially required for the entry but not DDC maintenance whereas Chk1 is only required for maintenance. 2. Cells with two irreparable breaks respond to DDR only up to a certain time (~12 h post damage) and beyond this point, depend on spindle assembly checkpoint (SAC) and mitotic exit network (MEN) to halt cell cycling. 3. The authors also propose an interesting model that the location of DNA breaks and their distance to centromeres can lead to the triggering of SAC/MEN and dictate the duration of cell cycle arrest and their adaptability following DNA damage. The results thus provide the most compelling evidence on the role of SAC/MEN in DNA damage response and cell cycle arrest albeit its impact might be limited to the current experimental set-up or under conditions when DNA repair is severely deficient.

      Overall, the conclusion of the study is well supported by an elegant set of genetic experiments that employ multiple readouts of the effect of DDC factor depletion on checkpoint integrity and cell cycle status. However, the study still relies heavily on Rad53 phosphorylation as the primary metric to assess checkpoint status. Since evidence exists that residual DDC still operates even when Rad53 phosphorylation is undetectable, additional readouts for DDC functions might be necessary to strengthen the study's conclusions. These and other concerns that need clarifications or further experimental validations are discussed below.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The DNA damage checkpoint (DDC) inhibits the metaphase-anaphase transition to repair various types of DNA damage, including DNA double strand breaks (DSBs). One irreparable DSB can maintain the DDC for 12-15 hours in yeast, after which the cells resume the cell cycle. If there are two DSBs, the DDC is maintained for at least 24 hours. In this study, the authors take advantage of this tighter DDC to investigate whether the best-known proteins involved in establishing the DDC are also responsible for its long-term maintenance during irreparable DSBs. They do this by cleverly degrading such proteins after DSB formation. They show that most, but not all, DDC proteins maintain the cell cycle block. Interestingly, DDC proteins become dispensable after 15 hours and the block is then maintained by spindle assembly checkpoint (SAC) proteins.

      Strengths:<br /> The authors have engineered a tight yeast system to study DDC shutdown after irreparable DSBs and used it to address whether checkpoint proteins (DDC and SAC) contribute to the long-term maintenance of DSB-mediated G2/M block. The different roles of Ddc2, Chk1, and Dun1 are interesting, while the fact that SAC overtakes DDC after 15 hours is intriguing and highlights how DSBs near and far from centromeres can have a profound impact on cell adaptation to DSBs.

      Weaknesses:<br /> Some of the results they present essentially confirm their own previous findings, albeit with a tighter strain design for long-term arrest. In addition, some conclusions about the role of specific DDC proteins in cell cycle arrest at G2/M need further experimental support. The results with Bfa1/Bub2 are surprising and somewhat unexpected. There is no clear mechanism for how depletion of Bub2, but not Bfa1, can relieve the G2/M (metaphase) block.

    1. eLife assessment

      This important study elucidates the function of the cohesin subunit SCC3 in impeding inter-sister chromatid DNA repair in rice. The observation of sterility in the SCC3 weak mutant prompted an investigation of abnormal chromosome behavior during anaphase I through karyotype analysis. While the evidence presented is largely solid, the strength of support can be substantially improved in some aspects, leaving room for further investigation. This research contributes to our understanding of meiosis in rice and will be of interest to cell biologists, reproductive biologists, and plant geneticists.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The manuscript describes the identification and characterization of rice SCC3, including the generation and characterization of plants containing apparently lethal null mutations in SCC3 as well as mutant plants containing a C-terminal frame-shift mutation. The weak scc3 mutants showed both vegetative and reproductive defects. Specifically, mitotic chromosomes appeared to partially separate during prometaphase, while meiotic chromosomes were diffuse during early meiosis and showed alterations in sister chromatid cohesion, homologous chromosome pairing, and recombination. The authors suggest that SCC3 acts as a cohesin subunit in mitosis and meiosis, but also has functions beyond cohesion.

      Strengths:<br /> The manuscript contains a large amount of generally high-quality data.

      Weaknesses:<br /> Several of the conclusions drawn in the manuscript are not supported by the data. There are many examples where the authors either draw conclusions or make statements that are just not justified based on the data presented or present a conclusion as a new finding, which has already been demonstrated in the past by others. For example, they claim that SCC3 functions in the maintenance of replication. From my reading of the manuscript, nowhere did the authors examine DNA replication. Likewise, several of the conclusions drawn are in direct contrast with what is known about SCC3 in other organisms. Therefore, the conclusions are either groundbreaking or incorrect.

    3. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript shows detailed evidence of the role of cohesin regulators in rice meiosis and mitosis.

      Strengths:<br /> There is a very clear mechanism for its role during replication. The strength of the evidence and its novelty is very high. This paper makes a significant contribution to the body of knowledge on meiotic cohesion in a valuable plant model.

      Weaknesses:<br /> The authors did not consider creating heterozygous mutants for the replication fork.<br /> Moderate English language editing may be required.

    4. Reviewer #3 (Public Review):

      Summary:<br /> Prior research on SCC3, a cohesin subunit protein, in yeast and Arabidopsis has underscored its vital role in cell division. This study investigated the specific functions of SCC3 in rice mitosis and meiosis. In a weakened SCC3 mutant, separation of sister chromatids was observed in anaphase I, resulting in 24 univalents and subsequent sterility. The authors meticulously documented SCC3's loading and degradation dynamics on chromosomes, noting its impact on DNA replication. Despite the loss of homologous chromosome pairing and synapsis in the mutant, chromosomes retained double-strand breaks without fragmenting. Consequently, the authors inferred that in the scc3 mutant, DNA repair more frequently relies on sister chromatids as templates compared to the wild type.

      Strengths:<br /> The study presents exceptionally well-executed research in the field of rice cytogenetics.

      Weaknesses:<br /> While the paper's conclusions are generally well-supported, further substantiation is needed for the claim that SCC3 inhibits template choice for sister chromatids. To bolster this conclusion, I recommend that the authors perform whole-genome sequencing on parental and F1 individuals from two rice variants, subsequently calculating the allele frequencies at heterozygous sites in the F1 individuals. If SCC3 indeed inhibits inter-sister chromatid repair in the wild type, we would anticipate a higher frequency of inter-homologous chromosome repair (i.e., gene conversion). This should be manifested as a bias away from the Mendelian inheritance ratio (50:50) in the offspring of the wild type compared to the offspring of the scc3+/- mutant.

    1. eLife assessment

      This study focuses on nuclear pore complex dysfunction in a mouse model of Alzheimer's disease-related Aβ pathology. The findings are useful in supporting the idea that nuclear cytoplasmic transport defects occur prior to plaque deposition in this disease model and may be caused by Alzheimer's disease pathology. However, the work suffers from overinterpretation of some of the data and remains incomplete in several respects: 1) molecular mechanisms that drive nuclear pore complex dysfunction are not explored, 2) evidence that time-dependent loss of the nuclear pore complex is linked to normal aging is lacking, and 3) how the observations reported in this work fit into broader views in the field on amyloid aggregation, accumulation, and pathogenesis in neurons needs to be clarified.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This manuscript describes a deficiency in the ability of nuclear pore complexes (NPCs) to maintain proper compartmentalization between the nucleus and cytoplasm in a mouse model of AD-related Aβ pathology. Experiments demonstrate NPC dysfunction in cultured neurons and mouse tissue as a result of intracellular Aβ, which may cause reduced levels of certain nucleoporins, leading to a reduced number of NPCs and to their dysfunction in nuclear protein import and in maintaining nucleocytoplasmic compartmentalization. In addition, the authors also report a potential mechanism for how NPC dysfunction may result in increased vulnerability to inflammation-induced necroptosis, where core components are reportedly activated via phosphorylation through nucleocytoplasmic shuttling. Overall, the study is interesting and well conducted and reveals striking NCT defects in an Aβ pathology disease model that may have important implications for our understanding of AD pathology.

      Strengths:<br /> Previous studies have found nucleocytoplasmic transport (NCT) defects in other models of age-related neurodegenerative diseases, including Huntington's disease, tauopathy, C9orf72-linked frontotemporal dementia / amyotrophic lateral sclerosis (FTD/ALS), and TDP-43 proteinopathy in FTD/ALS. Typically, NCT defects have been linked mechanistically to aberrant co-aggregation of nucleoporins with e.g. TDP-43 and tau found in disease models and sometimes also human autopsy tissue. This study is novel, in that it describes NCT defects that are caused by Alzheimer's disease (AD) related Aβ pathology, using a human APP knock-in mouse model (AppNL-G-F/NL-G-F) that exhibits robust Aβ pathology in the CNS. The main focus of this study is on the barrier dysfunction of the NPCs leading to compartmentalization defects, while previous publications in the field have focused more on active protein import and RNA export defects. This is of considerable interest since an age-dependent decline in NPC barrier function has been observed in transdifferentiated neurons derived from normal-aged fibroblasts (Mertens et al., 2015). The potential link of NPC dysfunction to an increased vulnerability to inflammation-induced necroptosis may also be relevant to other neurodegenerative disorders with NCT dysfunction. Experiments are largely focused on either dissociated neuronal cultures, or studies using mouse tissue at different stages of disease progression. Experiments are mostly based on immunocytochemistry (ICC) and histochemistry (IHC) of nucleoporins to show morphological NPC defects and fluorescent reporter constructs and dyes of defined MW to show NPC dysfunction. 
The experiments using an anti-nuclear pore O-linked glycoprotein antibody [RL1], which recognizes multiple metazoan nucleoporins that are modified via post-translational O-GlcNAcylation, show a very striking reduction in staining intensity that is also replicated with antibodies specific for the FG-motif rich Nup98 and the very stable and essential NPC component Nup107. Taken together, the fluorescence microscopy studies convincingly support the claim of NPC dysfunction leading to defective compartmentalization between the nucleus and cytoplasm.

      Weaknesses:<br /> However, the molecular mechanisms leading to NPC dysfunction and the cellular consequences of resulting compartmentalization defects are not as thoroughly explored. Results from complementary key experiments using western blot analysis are less impressive than microscopy data and do not show the same level of reduction. The antibodies recognizing multiple nucleoporins (RL1 and Mab414) could have been used to identify specific nucleoporins that are most affected, while the selection of Nup98 and Nup107 is not well explained. There is also no clear hypothesis on how Aβ pathology may affect nucleoporin levels and NPC function. All functional NCT experiments are based on reporters or dyes, although one would expect widespread mislocalization of endogenous proteins, likely affecting many cellular pathways. The second part of this manuscript reports that in App KI neurons, disruption in the permeability barrier and nucleocytoplasmic transport may enhance activation of key components of the necrosome complex that include receptor-interacting kinase 3 (RIPK3) and mixed lineage kinase domain-like (MLKL) protein, resulting in an increase in TNFα-induced necroptosis. While this is of potential interest, it is not well integrated in the study. This potential disease pathway is not shown in the very simple schematic (Fig. 8) and is barely mentioned in the Discussion section, although it would deserve a more thorough examination.

    3. Reviewer #2 (Public Review):

      Summary:<br /> The authors try to establish that there is an Abeta-dependent loss of nuclear pores early in Alzheimer's disease. To do so the authors compared different NUP proteins and assessed their function by analyzing nuclear leakage and resistance to induction of nuclear damage and the associated necroptosis. The authors use a mouse knockin for hAPP with familial Alzheimer's mutations to model amyloidosis related to Alzheimer's disease. Treatment with an inhibitor of beta-amyloid production partially rescued the loss of nuclear pore proteins in young KI neurons, implicating beta-amyloid in Nuclear Pore dysfunction, a mechanism already described in other neurodegenerative diseases but not in Alzheimer's disease.

      The conclusions of this paper related to familial AD are well supported by data, but those related to an age-dependent decline in NUP function are not; supporting them would require extended data analysis and one additional experiment.

      1. Adding statistics and comparisons between wild-type changes at different times/ages to determine if the nuclear pore changes with time in wild-type neurons. The images show differences in the Nuclear pore in neurons from the wild-type mice, with time in culture and age. However, a rigorous statistical analysis is lacking to address the impact of age/development on NUP function. Although the authors state that nuclear pore transport is reported to be altered in normal brain aging, the authors either did not design their experiments to account for the normal aging mechanisms or overlooked the analysis of their data in this light.

      2. Add experiments to assess the contribution of wild-type beta-amyloid accumulation with aging. As described in 2012 (Guix FX, Wahle T, Vennekens K, Snellinx A, Chávez-Gutiérrez L, Ill-Raga G, Ramos-Fernandez E, Guardia-Laguarta C, Lleó A, Arimon M, Berezovska O, Muñoz FJ, Dotti CG, De Strooper B. 2012. Modification of γ-secretase by nitrosative stress links neuronal ageing to sporadic Alzheimer's disease. EMBO Mol Med 4:660-673, doi:10.1002/emmm.201200243) and 2021 (Burrinha T, Martinsson I, Gomes R, Terrasso AP, Gouras GK, Almeida CG. 2021. Upregulation of APP endocytosis by neuronal aging drives amyloid-dependent synapse loss. J Cell Sci 134. doi:10.1242/jcs.255752), 28 DIV neurons are senescent and accumulate beta-amyloid 42. In addition, beta-amyloid 42 accumulates normally in the human brain (Baker-Nigh A, Vahedi S, Davis EG, Weintraub S, Bigio EH, Klein WL, Geula C. 2015. Neuronal amyloid-β accumulation within cholinergic basal forebrain in ageing and Alzheimer's disease. Brain 138:1722-1737. doi:10.1093/brain/awv024); thus, it would be important to determine if it contributes to NUP dysfunction. Unfortunately, the authors tested the Abeta contribution at 14 DIV, when wild-type Abeta accumulation was not detected. It would enrich the paper and allow the authors to draw conclusions about normal aging if additional experiments were performed, namely treating 28 DIV neurons with DAPT and assessing whether NUP is restored.

    4. Reviewer #3 (Public Review):

      Summary:<br /> This manuscript reports the novel observation of alterations in the nuclear pore (NUP) components and the function of the nuclear envelope in knock-in models of APP and presenilin mutations. The data show that loss of NUP immunoreactivity (IR) and pore density are observed at times prior to plaque deposition in this model. The loss of NUP IR is correlated with an increase in intraneuronal Abeta IR with two monoclonal antibodies that react with the N-terminus of Abeta. Similar results are observed in cultured neurons from APP-KI and Wt mice, where further results indicate that Abeta "drives" this process: incubation of neurons with oligomeric, but not monomeric or fibrillar, Abeta causes loss of NUP IR; incubation with conditioned media from KI cells, but not wt cells, also causes loss of NUP IR; and treatment with the gamma secretase inhibitor DAPT partially blocks the loss of NUP IR. Further data show that nuclear envelope function is altered in KI cells and KI cells are more sensitive to TNFalpha-induced necroptosis. This is potentially an important and significant report, but how this fits within the larger picture of what is known about amyloid aggregation and accumulation and pathogenesis in neurons needs to be clarified. The results from mouse brains are strong, while the results from cultured cells are in some instances of a lower magnitude, less convincing, ambiguous, and sometimes over-interpreted.

      Strengths:<br /> 1. Loss of NUP expression and activity is a novel observation.<br /> 2. Its association with intraneuronal Abeta immunoreactivity suggests an association with Alzheimer's disease.<br /> 3. The experiments generally appear to be well-controlled.<br /> 4. Multiple approaches are sometimes used to increase the robustness of the data.

      Weaknesses:<br /> 1. It does not consider the relationship of the findings here to other published work on the intraneuronal perinuclear and nuclear accumulation of amyloid in other transgenic mouse models and in humans.<br /> 2. It appears to presume that soluble, secreted Abeta is responsible for the effect rather than the insoluble amyloid fibrils.<br /> 3. Most of the critical findings on the association with Abeta and the functional consequences are done in cultured neurons and not in mouse models.<br /> 4. There is no evidence from the human brain that would strengthen the significance.<br /> 5. It is not clear when the alteration in NUP expression begins in the KI mice as there is no time at which there is no difference between NUP expression in KI and Wt and the earliest time shown is 2 months. If NUP expression is decreased from the earliest times at birth, then this makes the significance of the observation of the association with amyloid pathology less clear.

    1. eLife assessment

      This is a fundamental study describing a novel methylation event on EZH2 that regulates EZH2 protein stability and hematopoiesis. The methodologies are sound and the conclusions are largely supported by solid data. The work will be of interest to biomedical researchers in the field of cancer epigenetics.

    2. Reviewer #1 (Public Review):

      This study shows that SET7 and LSD1 regulate the dynamic methylation of EZH2 at K20, which is recognized by L3MBTL3 promoting protein degradation via the DCAF5-CRL4 E3 ubiquitin ligase. K20 methylation negatively regulates S21 phosphorylation and vice versa, modulating EZH2 functions. Mice harboring the K20 methylation-deficient mutant (K20R) exhibit hematopoietic defects. Overall, this is an interesting study elucidating a novel mechanism of EZH2 regulation. The methodologies are sound and the conclusions are largely supported by the data provided. However, there are some questions regarding the overall model and some contradictory results.

    3. Reviewer #2 (Public Review):

      EZH2 is upregulated in most advanced cancers and has been investigated as a therapeutic target for many years. However, how EZH2 activity is regulated remains to be fully elucidated. In this study, Guo et al. provided a new mechanism for the regulation of EZH2. The authors demonstrated that the protein stability of EZH2 is dynamically regulated by lysine methylation-dependent proteolysis. Specifically, K20 of EZH2 is monomethylated by SET7 methyltransferase and demethylated by LSD1 demethylase. The methylated K20 is recognized by specific methyl-lysine reader L3MBTL3 to promote EZH2 for ubiquitin-dependent proteolysis by the CRL4DCAF5 ubiquitin E3 ligase complex, resulting in the dysregulation of EZH2/PRC2 activity and reduction of H3K27me3. The authors further found a methylation-phosphorylation switch existed in some cancer cells and this switch controls EZH2 stability and hematopoiesis.

      Overall, most conclusions of this paper are well-supported by the results presented, only some aspects of Figure 6 need to be extended. This work is of interest to biomedical researchers in the field of cancer epigenetics after minor revision.

    4. Reviewer #3 (Public Review):

      In this study, the authors demonstrated a new mechanism by which the protein stability of EZH2 is regulated. This mechanism is multifaceted and yet the authors provided evidence for every step of regulation. EZH2 is monomethylated at K20 by SET7, which can be removed by LSD1 and recognized by L3MBTL3. L3MBTL3 recruits the ubiquitin E3 ligase CRL4DCAF5 to EZH2 via methylation of K20, which results in polyubiquitylation and proteasomal degradation of the histone methyltransferase. Additionally, they found that AKT-mediated phosphorylation of EZH2 at S21 blocks monomethylation at K20 and vice versa. Finally, they demonstrated in the K20R GEMM model that stabilization of EZH2 protein leads to reactive hyperplasia and hematopoiesis. In general, this study reveals an interesting and novel mechanism underlying the regulation of the epigenetic mark H3K27me3 and the oncogenic function of EZH2. The authors have considered every aspect of the signaling pathway that regulates the protein stability of EZH2. The data were comprehensive, rigorous, and supportive of the conclusions they made. Their results may help explain some of the conflicting results that previous studies have reported.

      However, there are still some issues with the significance of the work and the quality of the data. The major issues are:<br /> 1. The converged effect of EZH2 methylation and phosphorylation on H3K27me3 is unclear.<br /> 2. How the methylation-phosphorylation switch of EZH2 determines the biological phenotypes they observed is not addressed.<br /> 3. Some of the data in the manuscript is conflicting.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      In no particular order:

      1. In Figs S3 and S4, can they also show gamma fit? (or rather corrected fit accounting for abundance conditioning?) The shapes look different, especially for the microbial mat.

      Author response: We have added gamma distribution fits to the rescaled AFD plots (Figs. S3, S4).

      1. Lines 170-176 seem like they should come before lines 164-166.

      Author response: In lines 166-170 we discuss empirical patterns in the data that motivate the introduction of the SLM as a model in lines 170-175. We have clarified these points in the revision.

      1. The wiggles in the gamma predictions in the occupancy-abundance plots are because occupancy depends not only on abundance but also on the shape parameter, right? Probably good to write a sentence or two explaining what's going on here.

      Author response: We agree with the reviewer that the variation in the prediction could be driven in part by variation in the shape parameter across community members. We now include this observation in our revision (lines 209-211).
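
      To illustrate why occupancy can depend on the shape parameter as well as on mean abundance, here is a minimal sketch. It assumes that read counts arise from a gamma-distributed relative abundance (mean x̄, shape β) observed through Poisson sampling at depth N, so that the probability of seeing zero reads is (1 + x̄N/β)^(-β). The function name, depths, and parameter values are illustrative assumptions and are not taken from the manuscript.

```python
def expected_occupancy(mean_abundance, shape, depths):
    """Fraction of samples expected to contain at least one read.

    Assumes a gamma-distributed relative abundance (mean `mean_abundance`,
    shape `shape`) observed through Poisson sampling at the given depths.
    """
    absent = [(1.0 + mean_abundance * n / shape) ** (-shape) for n in depths]
    return 1.0 - sum(absent) / len(absent)


# Hypothetical sampling depths (total reads per sample).
depths = [5_000, 10_000, 20_000]

# Two taxa with the same mean relative abundance but different shape parameters.
broad = expected_occupancy(1e-4, shape=0.2, depths=depths)
narrow = expected_occupancy(1e-4, shape=2.0, depths=depths)
print(f"shape=0.2: {broad:.3f}  shape=2.0: {narrow:.3f}")
```

      Under these assumptions, two taxa with identical mean abundance land at different occupancies, which is one way the predicted occupancy-abundance curve can wiggle.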

      1. In the predicted vs observed occupancy plots, it would be nice to add curves showing predicted standard deviation or similar to give a sense of how well the model is predicting the variability.

      Author response: In the revised manuscript we now include predictions for the variance of occupancy using the gamma distribution under both taxonomic and phylogenetic coarse-graining (Fig. S9; S10; lines 211-214).

      1. Covariance between sister groups: Figs S9 and S10 look very nice, but it's hard to see much because they're log-log plots over multiple decades, while even a several-fold difference from y = x would indicate a strong effect of correlations. It would be clearer if the y-axis showed the ratio of the coarse-grained variance to the sum of OTU variances and we were looking at how well it fit y = 1.

      Author response: We have included these plots in the revision (Fig. S14, S15).
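
      The ratio requested by the reviewer follows from Var(Σ x_i) = Σ Var(x_i) + 2 Σ_{i<j} Cov(x_i, x_j): it equals 1 when the OTUs within a group are uncorrelated, exceeds 1 under positive covariance, and falls below 1 under negative covariance. A minimal sketch with made-up abundances (the function name and the toy values are assumptions for illustration only):

```python
from statistics import pvariance


def variance_ratio(otu_abundances):
    """Variance of the coarse-grained (summed) abundance across samples,
    divided by the sum of the per-OTU variances."""
    coarse = [sum(sample) for sample in zip(*otu_abundances)]
    return pvariance(coarse) / sum(pvariance(otu) for otu in otu_abundances)


# Rows are OTUs within one coarse-grained group, columns are samples (toy values).
positively_correlated = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.4, 0.6, 0.8]]
negatively_correlated = [[0.1, 0.2, 0.3, 0.4], [0.8, 0.6, 0.4, 0.2]]

ratio_pos = variance_ratio(positively_correlated)
ratio_neg = variance_ratio(negatively_correlated)
print(ratio_pos, ratio_neg)
```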

      1. If the sum of gammas can be well-approximated by a gamma, does that mean that the gamma is just a fairly flexible distribution and we shouldn't take the quality of the gamma fits in general as a very specific indication of what's going on?

      Author response: While the sum of random variables that are drawn from gamma distributions with different parameters is often well-approximated by another gamma, this does not tell us why the gamma distribution holds for microbial communities at the finest-grain level (i.e., OTUs/ASVs). At present, the best explanation is that the gamma is a stationary distribution for certain stochastic differential equations which have ecological interpretations (Grilli, 2020; Shoemaker et al., 2023). Furthermore, alternative two-parameter distributions have been tested alongside the gamma and have done a comparatively poor job capturing observed macroecological patterns (Grilli, 2020). These results suggest that the utility of the gamma distribution is not simply an outcome of its flexible nature; it succeeds because it has captured core ecological properties of microbial communities. In the case of the SLM, gamma-like distributions arise when a community member is subject to self-limiting growth and environmental noise. On the other hand, the stability of the gamma distribution might explain why it can be detected as the shape of the AFD, as it does not fade out across coarse-graining levels.
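
      As a rough illustration of how a sum of gammas can be approximated by a single gamma, the sketch below matches the first two moments. It assumes the summed variables are independent, which is a simplification, and it uses a (mean, shape) parameterization in which the variance is mean²/shape. The approximation is exact when all members share the same scale and only approximate otherwise; the function name and input values are hypothetical.

```python
def matched_gamma(members):
    """Moment-matched gamma for a sum of independent gamma variables.

    `members` is a list of (mean, shape) pairs; a gamma with mean m and
    shape b has variance m**2 / b. Returns the (mean, shape) of the gamma
    that has the same mean and variance as the sum.
    """
    total_mean = sum(m for m, _ in members)
    total_var = sum(m * m / b for m, b in members)
    return total_mean, total_mean ** 2 / total_var


# Three hypothetical OTUs merged into one coarse-grained group.
coarse_mean, coarse_shape = matched_gamma([(1e-3, 0.5), (4e-3, 1.5), (2e-4, 0.8)])
print(coarse_mean, coarse_shape)
```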

      1. What's going on with the variance of diversity in Fig S12? Does this suggest that some of the problem in Figure 4 could be with the analytic approximation rather than the model? I had a hard time understanding the part of the Methods explaining the simulation details (lines 587-597). It would be worth expanding this. Is there some way to explain how the correlations were simulated in terms of the SLM, e.g., correlations in the noise term across OTUs?

      Author response: We believe that deviations in the variance of diversity in Fig. S16g,h are driven by small deviations in our predictions of the second moment $$\left\langle (x \ln x)^{2} \mid N_{m}, \bar{x}_{i}, \beta_{i}^{2} \right\rangle$$ (Eq. S16). Alone these deviations are slight, but their effects become noticeable when summed over hundreds or thousands of taxa. We have included this observation in the revised manuscript (lines 268-271). However, this deviation pales in comparison with the magnitude of covariance in the empirical data, suggesting that our inability to predict the variance of richness and diversity is primarily driven by our assumption of statistical independence.

      Regarding the source of the correlations, under the SLM correlations in abundances can be introduced either by adding deterministic interaction terms or through correlated environmental noise. Determining which of these two options drives empirical correlations is an active area of research (e.g., Camacho-Mateu et al., 2023). For the purpose of this study, we remain agnostic on the cause of the correlations, opting instead to emphasize that the inclusion of correlations is necessary to reproduce observed slopes of the fine vs. coarse-grained relationship for diversity.

      1. In Figure 5ab, is the idea that the correlation in richness is primarily driven by the number of samples from the environment? Line 390 seems to say so, but it would be good to make this explicit and put it right in that section of the Results.

      Author response: Our results suggest that sampling effort (# reads) plays a larger role in determining the correlations between fine and coarse-grained measures of richness. We now clarify this point in the revised manuscript (lines 429-435).

      1. I don't totally understand the contrast in lines 369-372. If fine-scale diversity within one group begets coarse-grained diversity in another group, couldn't that show up as correlations in the AFDs? Or is the argument that only including within-group correlations in AFDs is enough to reproduce the pattern? I'm not sure I see how that could be.

      Author response: The term “begets” implies both causation and direction. If we see a positive relationship between diversity estimates at two different scales of observation, the causal mechanism cannot be determined solely from correlations between samples obtained once from different sites. So, mechanisms consistent with niche construction/"DBD" can produce correlations, though the existence of correlations does not necessarily imply DBD.

      1. The discussion of niche construction on 429-431 doesn't match very well with 440-441. Basically, niche construction is a very broad concept, not a specific one, right?

      Author response: In lines 472-476 (formerly 429-431) we discuss how the existence of correlations between fine and coarse-grained scales does not point to a single ecological mechanism. Alternatively stated, observing a non-zero slope does not mean that niche construction is driving the relationship.

      In lines 476-487 (formerly 440-441) we discuss how the mechanism of cross-feeding has been shown to generate a positive relationship between fine and coarse-grained measures of diversity. This mechanism can be interpreted as a form of “niche construction”, so it is an instance of a tested ecological mechanism that aligns with the interpretation given in Madi et al. (2020).

      1. Isn't (8) just the negative binomial distribution?

      Author response: The convolution of the stationary solution of the SLM (i.e., a gamma distribution) and the Poisson limit of a multinomial sampling distribution returns a negative binomial distribution of read counts across hosts if samples have identical sampling depths. We now include this detail in the revision (lines 593-595). Note, however, that if different samples have different sampling depths, the distribution of reads across samples is not a negative binomial.

      1. Missing 1/M in (9).

      Author response: We have fixed this omission in the revision.

      1. Schematic figures illustrating what the different statistics are intuitively capturing would really help this work be understandable to a broader audience, but they'd also be a ton of work.

      Author response: Richness and diversity are used in ecology to such an extent that we do not see the benefit of a conceptual diagram. Furthermore, we have included a conceptual diagram about our pipeline in our revision at the request of Reviewer 2 (Fig. S20).

      Reviewer #2:

      Major Recommendations

      If I were reviewing this manuscript for a regular journal, I believe the following issues would be important to address prior to publication.

      1. From my reading, the main points of this advance are that

      a. SLM models AFDs well at all levels of coarse-graining.

      b. This makes SLM a better null-model than UNTB for macroecological relationships.

      c. Using SLM on the EMP data, the richness slopes are well explained by SLM but not the diversity slopes. Therefore, any theory that hopes to explain the diversity slopes must include interactions. Argument B appears to be one of the key points yet is missing from the abstract, and should be made clearer. If these aren't the main points the authors intended, then other main points need to be highlighted more.

      Author response: In the revision we now explicitly mention argument b in the Abstract.

      1. The title should be more specific, so as to better reflect the content. (E.g. "UNTB is not a good null model for macroecological patterns" would seem more appropriate.)

      Author response: We would prefer to focus on the success of the SLM rather than the limitations of the UNTB in the title of this work. Therefore, we have modified our title as follows: “Investigating macroecological patterns in coarse-grained microbial communities using the stochastic logistic model of growth”.

      1. The manuscript would benefit from a clearer description of exactly what information the SLM retains about the data (perhaps even a cartoon panel in one of the figures). In particular, it is important to be explicit about the number of model parameters.

      Author response: The number of model parameters for the gamma AFD are now explicitly stated in the revision (Lines 579-580).

      1. The main point of Figures 2-4 seems to be that SLM is good at describing the data (and when it fails it is due to interactions) while UNTB fails to reproduce this behavior, in support of Argument B. This is not clear from the figure descriptions or titles, which focus on SLM's "predictive" power.

      Author response: Fig. 2a demonstrates that the gamma distribution predicted by the SLM explains the empirical distribution of abundances. This result provides motivation to predict the fraction of sites harboring a given community member (i.e., occupancy, Fig. 2c) as well as general measures of community composition including mean richness (Fig. 3a,c) and mean diversity (Fig. 3b,d) using parameters estimated from the data (not free parameters).

      This success led us to consider whether the gamma distribution could predict the variance of richness and diversity, which it could not because it does not capture covariance between community members (Fig. 4).

      In the revision we have identified opportunities to make these points clear throughout the Results. Furthermore, we have added additional detail to the legends of Figs. 2-4.

      1. The manuscript would benefit from clarifying the use of "prediction" related to the SLM. Since the gamma distributions predicted by SLM were fit to empirical data, it seems like the agreement between analytic means and empirical means (Fig. 3) is a statement on gamma distributions being a good fit for the AFD's more than SLM predicting richness and diversity. For example, from my reading, it seems like this analysis could be done numerically by shuffling species abundances across environments and seeing whether this changed the mean richness/diversity. I would not call this shuffling test a prediction, since it is more a statement on the relevance of interactions. SLM predicts gamma-distributed AFD's, but those distributions recovering the data they were trained on doesn't seem like a prediction.

      Author response: In this manuscript we identified the gamma distribution as an appropriate probability distribution to describe the distribution of relative abundances across samples over a range of coarse-grained scales. Motivated by this result, we performed a separate analysis where at each scale we estimated the mean and variance of relative abundance across sites for each community member. We then used these parameters to obtain the expected value of a community-level measure using an equation we derived by assuming that the gamma distribution was appropriate (e.g., richness, Eq. 13). We then compared the expected value of richness to the mean value from empirical data and assessed the similarity between the two values.

      The outcome of this procedure constitutes a prediction. While the mean and variance are parameters, estimating them from the empirical data has no connection with the operation of training a distribution on empirical data. We could have derived predictions such as Eq. 13 using any other probability distribution that can be parameterized using the mean and variance (e.g., Gaussian). Such a prediction would likely do a poor job even though it used the same means and variances used for our gamma predictions. This is because the choice of distribution would not have been a good descriptor of the distribution of abundances across hosts.
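
      The procedure described in this response (estimate each taxon's mean and variance, then plug them into a formula derived under the gamma assumption) can be sketched as follows. The exact form of Eq. 13 is not reproduced in this response, so the sketch assumes the gamma-Poisson zero-read probability (1 + x̄N/β)^(-β) with β estimated as x̄²/σ²; the function name and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def predicted_mean_richness(rel_abundances, depths):
    """Expected number of taxa detected per sample under a gamma AFD.

    rel_abundances: one list per taxon of relative abundances across samples.
    depths: total read count of each sample.
    """
    richness = 0.0
    for taxon in rel_abundances:
        m, v = mean(taxon), pvariance(taxon)
        shape = m * m / v  # method-of-moments estimate of the gamma shape
        p_absent = mean((1.0 + m * n / shape) ** (-shape) for n in depths)
        richness += 1.0 - p_absent
    return richness


# Toy relative abundances for three taxa across four samples.
toy_abundances = [
    [0.02, 0.05, 0.01, 0.04],
    [1e-4, 0.0, 3e-4, 2e-4],
    [0.0, 0.0, 1e-3, 0.0],
]
toy_depths = [8_000, 12_000, 10_000, 9_000]
prediction = predicted_mean_richness(toy_abundances, toy_depths)
print(prediction)
```

      The means and variances are computed without reference to any distribution; only the step that converts them into a probability of absence relies on the gamma assumption, which is the distinction the response draws.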

      To better explain this last -- perhaps the most significant -- issue, I'd like to ask the authors if the following recasting would be an accurate reflection of their conclusions, or if something is missing.

      1. "Focusing on the empirical relationship observed between diversity slopes by Madi 2020, we ask the question: does explaining these relationships require accounting for species-species correlations? Or could it be reproduced in a noninteracting model?" To address this question, one can perform a randomization test, shuffling abundances to preserve all single-OTU statistics but breaking any correlations. My reading of the authors' results is that (new result 1) the richness relationships would be preserved, while diversity relationships would not be preserved. [Note that this result 1 need not mention either SLM or UNTB.]

      Author response: The question of whether correlations between species are necessary to explain the observed slope of the fine vs. coarse-grained relationship was only one component of our research goals. Our first question was whether the SLM would prove to be a more appropriate null for evaluating the novelty of observed slopes. We believe that our results support the conclusion that the SLM is an appropriate null for this question, as it was able to capture observed slopes of the fine vs. coarse-grained relationship for estimates of richness, indicating that correlations, and the interactions ultimately responsible for them, are not necessary to explain this result.

      We then find that the SLM as a null model fails to capture observed slopes of the fine vs. coarse-grained relationship for estimates of diversity and simulate the SLM with correlations to return reasonable estimates of the slope. However, here the question about correlations is a direct follow-up from our question about a null model that excludes interactions, so it is unclear how a randomization test would relate to this result.
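
      For readers unfamiliar with the randomization test raised by the reviewer, the idea can be sketched as follows: permute each taxon's abundances across samples independently, which preserves every single-taxon statistic while breaking correlations between taxa. This is only an illustration on a toy count table and is not an analysis performed in the manuscript; the function names and data are hypothetical.

```python
import random
from statistics import mean, pvariance


def shuffle_within_taxa(abundances, rng):
    """Permute each taxon's abundances across samples independently.

    Every taxon keeps exactly the same set of values (so its mean, variance,
    and occupancy are unchanged), but correlations between taxa are broken.
    """
    return [rng.sample(taxon, len(taxon)) for taxon in abundances]


def richness_per_sample(abundances):
    """Number of taxa with a non-zero count in each sample."""
    return [sum(1 for value in sample if value > 0) for sample in zip(*abundances)]


# Toy table: rows are taxa, columns are samples; the first two taxa co-occur.
observed = [
    [5, 0, 3, 0, 7, 0],
    [2, 0, 4, 0, 1, 0],
    [0, 6, 0, 2, 0, 9],
]
shuffled = shuffle_within_taxa(observed, random.Random(0))

print(mean(richness_per_sample(observed)), pvariance(richness_per_sample(observed)))
print(mean(richness_per_sample(shuffled)), pvariance(richness_per_sample(shuffled)))
```

      Mean richness is unchanged by construction, because the total number of non-zero entries is preserved, whereas the variance of richness across samples is free to change once co-occurrence is broken.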

      1. Instead of doing a randomization test (resampling the empirical distribution), one might insist on instead fitting a model to the AFD distributions, and sampling from that distribution rather than the empirical one.

      a. If doing it this way, one should of course ensure that the distribution being fit is a good description of the data.

      b. UNTB is a bad fit. SLM is a better fit, and in fact (new result 2) continues to be a good empirical fit even at coarse-grained levels.

      c. Can make statements on using SLM as a null model for these types of cross-scale relationships. Could try arguing that fitting an SLM model per-OTU (instead of resampling the empirical distribution) could offer some advantage if certain properties could be computed analytically from the fit parameters, instead of averaging over multiple computational rounds of resampling.

      Do these two points accurately summarize the manuscript? If so, this presentation avoids the confusion with "prediction". If my summary is missing some important point, the presentation should be revised to clarify the points I appear to have missed.

      Author response: In our manuscript we derive predictions from the gamma distribution, the stationary distribution of the SLM, that require parameters estimated from the data (i.e., mean and variance of relative abundance). These parameters are estimated from the data using normal procedures and then plugged into our predictions that assume the appropriateness of the gamma, returning values that are then compared to estimates from empirical data. Our estimation of the mean and variance does not assume that the empirical distribution follows a gamma distribution, but the value returned by our function derived from the gamma distribution (e.g., Eq. 13) does make that assumption.

      To address the reviewer’s broader comment, we believe that the following points summarize our manuscript:

      1. The gamma distribution as a stationary solution of the SLM captures macroecological patterns and predicts typical community-level properties (i.e., mean richness and diversity) across phylogenetic and taxonomic scales.

      2. The gamma distribution fails to predict variation in community-level properties (i.e., variance of richness and diversity) across phylogenetic and taxonomic scales. This occurs because the SLM is a mean-field model that does not explicitly include interactions between community members.

      3. Despite the inability to capture interactions, the gamma distribution succeeds at predicting the fine vs. coarse-grain slope for richness, a pattern that had previously been attributed to community member interactions. This result demonstrates that the novelty of a macroecological pattern hinges on one’s choice of null model.

      4. However, the gamma cannot capture the same relationship for diversity. Simulations of the gamma distribution that incorporate correlations between community members are capable of generating reasonable estimates of the slope.

      To address the reviewer’s comments regarding the appropriateness of fitted gamma distributions, in our revision we have added fitted gamma distributions to plots of AFDs so that the reader can visually assess the ability of the gamma to describe empirical patterns (Fig. S3, S4).

      We have also obtained predictions for the slope of the fine vs. coarse-grained relationship for community richness using the same form of UNTB used by Madi et al (2020). In our revised manuscript we establish a procedure to infer the single parameter of this model, generate predictions of richness at fine and coarse-grained scales, and then evaluate whether the UNTB is capable of predicting the slope of the fine vs. coarse-grained relationship for richness (Supplementary Information; Figs. S18, 24-28; lines 277-278; 370-380).

      Other/minor comments

      1. The manuscript would be improved with more consistent terminology ("fine vs. coarse-grained relationship"/"the relationship" vs. "diversity slope"). Also, many readers may be used to OTUs referring to the rather fine level of description, as opposed to any chosen level; and could interpret indexing over groups as being in contrast with indexing over OTU's (coarse vs fine). The authors' use is perfectly correct, but keeping a consistent terminology would help.

      Author response: We have revised our manuscript to specify the “slope” as the “slope of the fine vs. coarse-grained relationship” (e.g., Line 318). We also specify in the Results and in the Methods that we use “fine” and “coarse” as relative terms, keeping with the sliding-scale approach used in Madi et al (2020).

      1. While I appreciate this "slope" is something borrowed from other work, the clarity of the paper might benefit from a cartoon of how one goes from the raw data to the slopes at a particular coarse-graining level. (Optional).

      Author response: We have added a conceptual diagram to the revision (Fig. S20).

      1. The text often colloquially references "the gamma," "predictions of the gamma," etc. This phrasing comes across as sloppy, and the manuscript would be improved by being more specific.

      Author response: We now specify “gamma” as the “gamma distribution” throughout the manuscript.

      1. Equation 6 appears to be missing some subscripts on the x terms (included on the left of the equation).

      Author response: We thank the reviewer for noticing this error and we have corrected it in the revision.

      1. In "Simulating communities of correlated...AFDs", the acronym SAD is not defined.

      Author response: We thank the reviewer for noticing this error and we have corrected it in the revision.

      1. In Figure 2:

      a. Invariant is probably the wrong word for the title, since all the AFD's were rescaled by mean and variance before being compared. Data does support that the gamma distributions are good at describing the AFD's, but as stated in the description it's the general shape that is preserved, not the distribution itself.

      Author response: When we mention the invariance of the AFD we now specify that we mean that the shape of the distribution remained qualitatively invariant.

      b. I'd recommend changing the color coding to something with more contrast, since currently it's impossible to assess the claim that the shape of the distribution collapses.

      Author response: Our coarse-graining procedure is a sequential operation that has no intuitive point that would suggest the use of a contrasting colormap (e.g., if our scale ranged from -1 to 1 then there would be a natural point of contrast at zero).

      c. The legend is missing relevant technical details: How many OTU's were used to make plot a? How many samples?

      Author response: The number of samples was listed in the Materials and Methods (line 523). In the revision we now include a table with the average and total number of OTUs as well as the average number of reads for each environment (Table S1, S2).

      d. In plot b, is the mean relative abundance referring to "mean abundance when observed" or "mean across all samples"?

      Author response: The mean relative abundance is the mean abundance across all sites (line 204) and in the legend of Fig. 2.

      e. Since one argument here is that SLM fits these distributions better than UNTB, if possible it would be nice to see UNTB's failed fits here.

      Author response: A major feature of the UNTB is that the demographic parameters of community members are indistinguishable. Under the SLM, the variation in the mean relative abundance we observe suggests that the carrying capacities of community members vary over multiple orders of magnitude, a result that is incompatible with most forms of the UNTB (x-axis of Fig. 2b). We now mention this point in the revised manuscript (lines 110; 229; 455-471).

      1. In Figure 3:

      a. It is not clear how coarse-graining is included in model fitting. The "Deriving biodiversity measure predictions" section would benefit from including how coarse-graining is incorporated.

      Author response: We predict measures of biodiversity separately at each coarse-grained scale. We now clarify this detail in the revised manuscript (Lines 624-627).

      b. Reference Shannon Diversity in Methods.

      Author response: We now cite Shannon’s diversity.

      c. What is the blue/white color coding in plots a & c? It doesn't have any color key.

      Author response: Figs. 3-6 use a uniform light-to-dark scale for all environments, with each environment having its own color. For example, Fig. 3a contains data from the human gut microbiome. Human gut data were assigned the color aquamarine, so the shade of aquamarine for a given datapoint in Fig. 3a indicates the phylogenetic scale.

      In the revision we now clarify the colorscale in the legend of Fig. 3 and specify that the same scale is used in all subsequent figure legends.

      d. Re: earlier comments, why is richness considered a prediction? (Am I correct in my interpretation that panel b is almost a tautology - counting the number of zeros in the matrix either by rows or by columns - whereas panel d is nontrivial?)

      Author response: Mean richness as a measure of biodiversity depends on the fraction of sites where a given community member is present (i.e., occupancy). The mean relative abundance of a community member and its variation across sites (beta) is clearly related to occupancy, but those two statistics do not give you a prediction of occupancy. Obtaining a prediction of occupancy and, subsequently, richness, requires 1) a probability distribution of abundances (i.e., the gamma) and 2) a probability distribution of sampling (i.e., the Poisson). Using these two pieces of information, we derived a prediction for mean richness (Eq. 13). We then compare the value of richness obtained by plugging in the mean relative abundances, betas, and known number of reads to the observed mean richness obtained from the data.
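
      The two ingredients named in this response (a gamma distribution of abundances and Poisson sampling) can be combined in a short numerical sketch. Under these assumptions the number of reads of a community member follows a gamma-Poisson (negative binomial) distribution, so the probability of observing zero reads in a sample of N reads is (1 + mean * N / beta)^(-beta), with beta = mean^2 / variance. Whether this matches Eq. 13 of the manuscript term by term is an assumption here; the function names and argument layout are hypothetical.

```python
import numpy as np


def predicted_occupancy(mean_rel_abundance, var_rel_abundance, n_reads):
    """Predicted occupancy of each community member.

    Assumes gamma-distributed relative abundances and Poisson sampling.
    mean_rel_abundance, var_rel_abundance: one value per community member.
    n_reads: total number of reads in each site.
    """
    mean = np.asarray(mean_rel_abundance, dtype=float)[:, None]
    var = np.asarray(var_rel_abundance, dtype=float)[:, None]
    reads = np.asarray(n_reads, dtype=float)[None, :]
    beta = mean**2 / var  # shape parameter of the gamma distribution
    # Probability of zero reads for each (member, site) pair.
    p_absent = (1.0 + mean * reads / beta) ** (-beta)
    # Occupancy is the probability of presence, averaged over sites.
    return 1.0 - p_absent.mean(axis=1)


def predicted_mean_richness(mean_rel_abundance, var_rel_abundance, n_reads):
    """Predicted mean richness: the sum of predicted occupancies."""
    occupancy = predicted_occupancy(
        mean_rel_abundance, var_rel_abundance, n_reads
    )
    return occupancy.sum()
```

      The point of the sketch is that the mean and variance alone do not yield a richness value. The prediction only appears once a distributional form is assumed, which is why replacing the gamma with another distribution that has the same mean and variance would generally give a different prediction.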

      e. The lettering of subplots in Figure 3 is not consistent with Figure 4. Figure 3 subplots are also cited incorrectly in paragraph two on page six (lines 251-254).

      Author response: We thank the reviewer for noticing the error and we have corrected it in the revision.

      f. Again, if possible show UNTB predictions in plots a & c.

      Author response: In our revised manuscript we provide extensive descriptions and predictions of mean richness and the slope of the fine vs. coarse-grained relationship for richness using the form of the UNTB used in Madi et al. (2020; Figs. S18, S24 - S29; lines 277-282; 370-380). We then compare the error of these slope predictions to those obtained from the SLM, finding that the SLM generally outperforms UNTB (Figs. S27-S29).

      1. In Figure 4:

      a. What are the color codings in plots a & b?

      Author response: The color scale used in Fig. 4 is identical to the color scale used in Fig. 3. This detail is now specified in the legend of Fig. 4.

      b. What are the two lines of empirical data in plots a & b, and why is one of them dashed?

      Author response: We now specify what the two lines mean in the key within the figure.

      c. Same comment as earlier on predictions and richness.

      Author response: We now specify what the two lines mean in the key within the figure.

      1. In Figure 5:

      a. It wasn't clear to me in the manuscript how the authors generated these plots from the raw data. The manuscript would benefit from a clear cartoon/description of the data pipeline, from raw data to empirical (and analytic) slopes.

      Author response: We have added a conceptual diagram to the revised manuscript (Fig. S20).

      b. Make the figure title more descriptive to better connect it to the figure's objective (the richness slopes relationship is not novel, but the diversity slopes relationship is).

      Author response: We have revised the figure title.

      References

      Camacho-Mateu, J., Lampo, A., Sireci, M., Muñoz, M. Á., & Cuesta, J. A. (2023). Species interactions reproduce abundance correlations patterns in microbial communities (arXiv:2305.19154). arXiv. https://doi.org/10.48550/arXiv.2305.19154

      Grilli, J. (2020). Macroecological laws describe variation and diversity in microbial communities. Nature Communications, 11(1), 4743. https://doi.org/10.1038/s41467-020-18529-y

      Madi, N., Vos, M., Murall, C. L., Legendre, P., & Shapiro, B. J. (2020). Does diversity beget diversity in microbiomes? eLife, 9, e58999. https://doi.org/10.7554/eLife.58999

      Shoemaker, W. R., Sánchez, Á., & Grilli, J. (2023). Macroecological laws in experimental microbial systems (p. 2023.07.24.550281). bioRxiv. https://doi.org/10.1101/2023.07.24.550281

    1. eLife assessment

      This valuable paper addresses a notable problem, the cell biological control of biomineralization, with the sea urchin embryo as an experimental model. The paper provides evidence that ROCK and the cytoskeleton play a role in biomineralization, but the evidence is deemed currently incomplete, as there are concerns regarding the efficacy and specificity of the reagents used to perturb ROCK function. In addition, the data do not point to a plausible mechanism by which the actin cytoskeleton might regulate the biomineralization process.

    1. eLife assessment

      This is a fundamental work that significantly advances our understanding of the role of mossy cells in the dentate gyrus in Fragile X Syndrome. The carefully designed and executed extensive series of experiments provide compelling evidence that changes in their excitability occur due to up-regulation of Kv7 currents. The study unveils the underlying mechanisms of the disease, and therefore the work will be of interest to neuroscientists working on various aspects of Fragile X pathology. In addition, it also provides insights into how neuronal activity is balanced in networks through diverse cellular mechanisms.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their thorough assessment of our study, their overall enthusiasm, and the helpful suggestions for clarifying the methods and results, additional analyses, and discussion points. We have made earnest efforts to address the weaknesses raised in the public review and other recommendations made by the reviewers.

      Public Reviews:

      Reviewer #1 (Public Review):

      Herein, Blaeser et al. explored the impact of migraine-related cortical spreading depression (CSD) on the calcium dynamics of meningeal afferents that are considered the putative source of migraine-related pain. Critically previous studies have identified widespread activation of these meningeal afferents following CSD; however, most studies of this kind have been performed in anesthetized rodents. By conducting a series of technically challenging calcium imaging experiments in conscious head fixed mice they find in contrast that a much smaller proportion of meningeal afferents are persistently activated following CSD. Instead, they identify that post-CSD responses are differentially altered across a wide array of afferents, including increased and decreased responses to mechanical meningeal deformations and activation of previously non-responsive afferents following CSD. Given that migraine is characterized by worsening head pain in response to movement, the findings offer a potential mechanism that may explain this clinical phenomenon.

      Strengths:

      Using head fixed conscious mice overcomes the limitations of anesthetized preps and the potential impact of anaesthesia on meningeal afferent function which facilitated novel results when compared to previous anesthetized studies. Further, the authors used a closed cranial window preparation to maximize normal physiological states during recording, although the introduction of a needle prick to induce CSD will have generated a small opening in the cranial preparation, rendering it not fully closed as suggested.

      Weaknesses:

      Although this is a well conducted technically challenging study that has added valuable knowledge on the response of meningeal afferents the study would have benefited from the inclusion of more female mice. Migraine is a female dominant condition and an attempt to compare potential sex-differences in afferent responses would undoubtedly have improved the outcome.

      Our study included only two females, largely reflecting the much higher success rate of AAV-mediated meningeal afferent GCaMP expression in males than in females. The reason for the lower yield in female mice is unclear to us at present but may involve, at least partly, sex-specific differences in the mechanisms responsible for efficient transduction with this AAV vector observed in peripheral tissues (Davidoff et al. 2003). While our study did not address sex differences, a recent study (Melo-Carrillo et al. 2017) reported CSD equally activating and sensitizing second-order dorsal horn neurons that receive input from meningeal afferents in male and female rats.

      The authors imply that the current method shows clear differences when compared to older anaesthetized studies; however, many of these were conducted in rats and relied on recording from the trigeminal ganglion. Inclusion of a subgroup of anesthetized mice in the current preparation may have helped to answer the outstanding question of whether the difference is species dependent or a result of the different technical approaches.

      We have tried to address the anesthesia issue by conducting imaging sessions in several isoflurane-anesthetized mice. However, during these experiments, we observed a substantial decrease in the GCaMP fluorescence signal with a much lower signal-to-noise ratio that made the analyses of the afferents’ calcium signal unreliable. Reduced GCaMP signal in meningeal axons during anesthesia may be related to the development of respiratory acidosis, since lower pH leads to decreased GCaMP signal, as also mentioned by Reviewer #3. Of note, urethane anesthesia, which was used in all previous rat experiments, also produces respiratory acidosis.

      The authors discuss meningeal deformations as a result of locomotion; however, despite referring to their previous work (Blaeser et al., 2022), the exact method of how these deformations were measured could be clearer. It is challenging to imagine that simple locomotion would induce such deformations and the one reference in the introduction refers to straining, such as cough that may induce intracranial hypertension, which is likely a more powerful stimulus than locomotion.

      As part of the revision, we now provide a better description of the methodology (“Image processing and calcium signal extraction” section) used to determine meningeal deformations, including scaling, shearing, and Z-shift. In our previous paper (Blaeser et al. 2023), we provided an extensive description of the types of meningeal deformations occurring in locomoting mice. It should also be noted that locomotion drives cerebral vasodilation and intracranial pressure increases (Gao and Drew, 2016), which likely mediate, at least in part, the movement of the meninges towards the skull (positive Z-shift) and potentially other meningeal deformation parameters. We also agree with the reviewer that sudden maneuvers such as coughing and sneezing that lead to a larger increase in intracranial pressure are likely to be even more powerful drivers of endogenous intracranial mechanical stimulation than locomotion. Thus, our finding of increased responsiveness to locomotion-related meningeal deformation post-CSD may underestimate the increased afferent responsivity post-CSD during other behaviors such as coughing. We added this point to the discussion.

      More recently, several groups have used optogenetic triggering of CSD to avoid opening of the cranium for needle prick. Given the authors robustly highlight the benefit of the closed cranium approach, would such an approach not have been more appropriate?

      We agree with the reviewer that optogenetic methods used for CSD induction in non-craniotomized animals will further ensure accurate pressurization and, thus, will be an even better approach that avoids the burr hole used for pinprick. It should be noted, however, that the burr hole used for the pinprick likely had a minimal effect on intracranial pressure, as we minimized depressurization by plugging the burr hole throughout the experiments with a silicone elastomer. We have added this information to the revised Methods section.

      It is also worth noting that the optogenetic methodology used by others to provoke CSD was optimized only recently and relies on transgenic mice with a strong expression of YFP (Thy1.ChR2-YFP mice) within the superficial cortex that is not compatible with the afferent GCaMP imaging of meningeal afferents. Modifications using red-shifted opsins may allow the use of this strategy in the future.

      It was not clear how deformations predictors increased independent of locomotion (Figure 4D) as locomotion is essentially causing the deformations as noted in the study. This point was not so clear to this reviewer.

      As noted in our previous paper (Blaeser et al., 2023), deformation variables often exhibit different time courses than locomotion, even when a deformation is initially induced by the onset of locomotion. Most notably, the scaling-related deformation ramps up slowly and often persists for tens of seconds after the onset and termination of locomotion, which may be related to the recovery dynamics of the meningeal vascular response to locomotion. Overall, while locomotion serves as a predictor of meningeal deformation, we observed previously (Blaeser et al. 2023) many afferents whose responses were more closely associated with the moment-to-moment deformations than with the state of locomotion per se, suggesting that a unique set of stimuli is responsible for the activation of this deformation-sensitive afferent population. The increased sensitivity to deformation signals we observed following CSD suggests that the afferent population sensitive to deformation has unique properties that render it most susceptible to becoming sensitized following CSD. We now discuss this possibility.

      Reviewer #2 (Public Review):

      This is an interesting study examining the question of whether CSD sensitizes meningeal afferent sensory neurons leading to spontaneous activity or whether CSD sensitizes these neurons to mechanical stimulation related to locomotion. Using two-photon in vivo calcium imaging based on viral expression of GCaMP6 in the TG, awake mice on a running wheel were imaged following CSD induction by cortical pinprick. The CSD wave evoked a rise in intracellular calcium in many sensory neurons during the propagation of the wave but several patterns of afferent activity developed after the CSD. The minority of recorded neurons (10%) showed spontaneous activity while slightly larger numbers (20%) showed depression of activity, the latter pattern developed earlier than the former. The vast majority of neurons (70%) were unaffected by the CSD. CSD decreased the time spent running and the numbers of bouts per minute but each bout was unaffected by CSD. There also was no influence of CSD on the parameters referred to as meningeal deformation including scale, shear, and Z-shift. Using GLM, the authors then determine that there is an increase in locomotion/deformation-related afferent activity in 51% of neurons, a decrease in 12% of neurons, and no change in 37%. GLM coefficients were increased for deformation related activity but not locomotion related activity after CSD. There also was an increase in afferents responsive to locomotion/deformation following CSD that were previously silent. This study shows that unlike prior reports, CSD does not lead to spontaneous activity in the majority of sensory neurons but that it increases sensitivity to mechanical deformation of the meninges. This has important implications for headache disorders like migraine where CSD is thought to contribute to the pathology in unclear ways with this new study suggesting that it may lead to increased mechanical sensitivity characteristic of migraine attacks.

      1) It would be helpful to know what is meant by "post-CSD" in many of the figures where a time course is not shown. The methods indicate that 4, 30 min runs were collected after CSD but this would span 2 hours and the data do not indicate whether there are differences across time following CSD nor whether data from all 4 runs are averaged.

      While we monitored time course changes in ongoing activity (see Figure 2), it was challenging to evaluate post-CSD changes in locomotion-related deformation responses at a fine temporal scale, as running bouts resumed at different time points post-CSD and occurred intermittently throughout the post-CSD analysis period. Our experiments were also not sufficiently powered to break out analyses at multiple different epochs post-CSD, partly because there wasn’t much locomotion. To allow comparisons using a sufficient number of bouts, we conducted our GLM analyses using all data collected during running bouts in the 2-hour post-CSD period (termed “post-CSD”) versus in the 1-hour pre-CSD period. We have now clarified this further in the main text and figure legends.

      2) Why is only the Z-shift data shown in Figures 4A-C? Each of the deformation values seems to contribute to the activity of neurons after CSD but only the Z-shift values are shown.

      In many afferents, only one deformation variable best predicted the activity at both the pre- and post-CSD epochs. However, at the population level, all deformation variables were equally predictive. In the examples provided, the afferent developed augmented sensitivity that could only be predicted by the Z-shift variable, and the other deformation variables were not included to keep the figure legible. This is now clarified in the figure legend.

      3) How much does the animal moving its skull against the head mount contribute to deformations of the meninges if the skull is potentially flexing during these movements? Even if mice are not locomoting, they can still attempt to move their heads thus creating pressure changes on the skull and underlying meninges. The authors mention in the methods that the strong cement used to bind the skull plates and headpost together minimize this, but how do they know it is minimized?

      We did not measure skull flexing during locomotion and its potential effect on meningeal deformation. However, we would like to point out several considerations. It is evident from numerous imaging studies across various brain regions in freely moving animals, utilizing brain motion registration, that brain motion of the same scale (a few microns), as that observed in our studies, also occurs in the absence of head fixation (e.g., Glas et al, 2019; Zong et al 2021). In our system, the head-fixed mouse is locomoting on a cantilevered (spring-like) running wheel (see also Ramesh et al., 2018), which dissipates most, albeit not all, upward and forward forces applied to the skull during locomotion. Furthermore, the position of the headpost, anterior to where the mouse's paws touch the wheel, makes it hard for the mouse to push straight up and apply forces to the skull. We have updated the text in the methods section (Running wheel habituation) to address this. In our previous work (See Figure 2B in Blaeser et al. 2023), we found a substantial subset of afferents showing an increase in calcium activity that began after each bout of locomotion had terminated, and that lasted for many seconds, suggesting that skull flexing during locomotion may not play a leading role. Finally, we proposed in that study that meningeal deformations play a major role in the afferent response, given our findings of (i) sigmoidal stimulus-response curves between afferent activity and meningeal deformation and (ii) of different afferents that track scaling deformations along different axes. It is unlikely that all of these are related to any residual forces generated from skull deformations.

      4) What is the mechanism by which afferents initiate the calcium wave during the CSD itself? Is this mechanical pressure due to swelling of the cortex during the wave? If so, why does the CSD have no impact on the deformation parameters? It seems that this cortical swelling would have some influence on these values unless the measurements of these values are taken well after cortical swelling subsides. Related to point 1 above, it is not clear when these measurements are taken post-CSD.

      We provide, for the first time, evidence that CSD evokes local calcium elevation in meningeal afferent fibers in a manner that is incongruent with action potential propagation, as the activity gradually advances along individual afferents across many seconds during the wave. As indicated in Figure 1H, we measured these changes during the first 2 minutes post-CSD. Based on the reviewer’s question, we have now addressed whether mechanical changes occurring in the cortex in the wake of CSD might be responsible for the acute afferent activation we observed. We now include new data (Results, “Acute afferent activation is not related to CSD-evoked meningeal deformation” and Figure S2) showing an acute phase of meningeal deformation (as expected given the changes in extracellular fluid volume) lasting 40-80 seconds following the induction of CSD. Our data suggest, however, that these meningeal deformations are unlikely to be the main driver of the acute afferent calcium response. We propose that, based on the speed of the afferent calcium wave propagation and the distinct dynamics of calcium activity as compared to the dynamics of the deformations, the acute afferent response is more likely to be mediated by the spread of algesic mediators (e.g., glutamate, K+, ATP) and their diffusion into the overlying meninges.

      Because the peri-CSD meningeal deformations return to baseline soon after the cessation of the CSD wave, they are unlikely to affect our analyses of post-CSD changes in afferent sensitivity in the following 2 hours. This is also supported by our data (see Figure 3F-H) showing similar locomotion-related deformations pre- and post-CSD, which were measured after the deformations related to the CSD itself had subsided.

      5) How does CSD cause suppression of afferent activity? This is not discussed. It is probably a good idea in this discussion to reinforce that suppression in this case is suppression of the calcium response and not necessarily suppression of all neuronal activity.

      The mechanism underlying the suppression of afferent activity remains unclear. We now discuss the following points:

      First, the pattern of afferent responses resembles the rapid loss of cortical activity in the wake of a CSD, but its faster recovery points to a mechanism distinct from the pre- and post-synaptic changes responsible for the silencing of cortical activity (Sawant-Pokam et al., 2017; Kucharz and Lauritzen, 2018). Whether CSD drives the local release of mediators capable of reducing afferent excitability and spiking dynamics will require further studies.

      Second, the reviewer proposes that the suppressed calcium activity we observed in ~20% of the afferents immediately following CSD may reflect a decreased calcium response independent of afferent spiking activity. Such a process could theoretically involve factors influencing the GCaMP fluorescence (see also our response to Reviewer #3) and/or factors modifying the afferents’ spiking-to-calcium coupling. We note that if a CSD-related factor could modify the calcium response independent of afferent spiking, one would expect a more consistent effect across axons, reflected as a reduced signal in a larger proportion of the afferents, which we did not observe.

      6) How do the authors interpret the influence of CSD on locomotor activity? There was a decrease in bouts but the bouts themselves showed similar patterns after CSD. Is CSD merely inhibiting the initiation of bouts? Is this consistent with what CSD is known to do to motor activity? And again related to point 1, how long after CSD were these measurements taken? Were there changes in locomotor activity during the actual CSD compared to post-CSD?

      To the best of our knowledge, there is very little data on the effect of CSD on motor activity, making it challenging to engage in further speculation regarding the mechanisms underlying the preservation of running bout patterns post-CSD. Houben et al. (2017) described a similar reduction in locomotion in mice, corresponding to decreased motor cortex (M1) activity, and preservation of intermittent locomotion bouts. In the revised Results section, we now provide information about the cessation of locomotor activity during the CSD wave and have added information regarding the measurement of locomotion following CSD.

      7) The authors mention the caveats of prior work where the skull is open and is thus depressurized. Is this not also the case here given there is a hole in the skull needed to induce CSD?

      Unlike previous electrophysiological studies, which involved several large openings (~2x2 mm), including at the site of the afferents’ receptive field, our study involved only a small burr hole located remotely (1.5 mm) from the frontal edge of our imaging window. As noted in our response to Reviewer #1, this burr hole (~0.5 mm diameter) was unlikely to produce inflammation at the imaging site or cause depressurization as it was sealed with a silicone plug throughout the experiment.

      8) The authors should check the %'s and the numbers in the pie chart for Figure 4. Line 224 says 53 is 22% but it does not look this way from the chart.

      The 22% reported is the percentage of afferents that developed sensitivity post-CSD among all the non-sensitive ones pre-CSD. The pie chart illustrates only afferents that were deemed sensitive before and/or after the CSD. We removed the % to clarify.

      9) Line 319 mentions that CSD causes "powerful calcium transients" in sensory neurons but it is not clear what is meant by powerful if there are no downstream effects of these transients being measured. The speculation is that these calcium transients could cause transmitter release, which would be an important observation in the absence of AP firing, but there are no data evaluating whether this is the case.

      We changed the term to “robust”.

      Reviewer #3 (Public Review):

      Summary:

      Blaeser et al. set out to explore the link between CSD and headache pain. How does an electrochemical wave in the brain parenchyma, which lacks nociceptors, result in pain and allodynia in the V1-3 distribution? Prior work had established that CSD increased the firing rate of trigeminal neurons, measured electrophysiologically at the level of the peripheral ganglion. Here, Blaeser et al. focus on the fine afferent processes of the trigeminal neurons, resolving Ca2+ activity of individual fibers within the meninges. To accomplish these experiments, the authors injected AAV encoding the Ca2+ sensitive fluorophore GCamp6s into the trigeminal ganglion, and 8 weeks later imaged fluorescence signals from the afferent terminals within the meninges through a closed cranial window. They captured activity patterns at rest, with locomotion, and in response to CSD. They found that mechanical forces due to meningeal deformations during locomotion (shearing, scaling, and Z-shifts) drove non-spreading Ca2+ signals throughout the imaging field, whereas CSD caused propagating Ca2+ signals in the trigeminal afferent fibers, moving at the expected speed of CSD (3.8 mm/min). Following CSD, there were variable changes in basal GCamp6s signals: these signals decreased in the majority of fibers, signals increased (after a 25 min delay) in other fibers, and signals remained unchanged in the remainder of fibers. Bouts of locomotion were less frequent following CSD, but when they did occur, they elicited more robust GCamp6s signals than pre-CSD. These findings advance the field, suggesting that headache pain following CSD can be explained on the basis of peripheral cranial nerve activity, without invoking central sensitization at the brain stem/thalamic level. This insight could open new pathways for targeting the parenchymal-meningeal interface to develop novel abortive or preventive migraine treatments.

      Strengths:

      The manuscript is well-written. The studies are broadly relevant to neuroscientists and physiologists, as well as neurologists, pain clinicians, and patients with migraine with aura and acephalgic migraine. The studies are well-conceived and appear to be technically well-executed.

      Weaknesses:

      1) Lack of anatomic confirmation that the dura were intact in these studies: it is notoriously challenging to create a cranial window in mouse skull without disrupting or even removing the dura. It was unclear which meningeal layers were captured in the imaging plane. Did the visualized trigeminal afferents terminate in the dura, subarachnoid space, or pia (as suggested by Supplemental Fig 1, capturing a pial artery in the imaging plane)? Were z-stacks obtained, to maintain the imaging plane, or to follow visualized afferents when they migrated out of the imaging plane during meningeal deformations?

      We agree that avoiding disruption of the dura is challenging. Indeed, it took many months of practice before conducting the experiments in this manuscript to master methods for a craniotomy that spared the dura.

      We addressed the issue of meningeal irritation due to cranial window surgery in our previous work (Blaeser et al., 2022). In brief, we conducted vascular imaging using the same cranial window approach and showed no leakage of macromolecules from dural or pial vessels anywhere within the imaging window at 2-6 weeks after the surgery (Figure S1D in Blaeser et al. 2022). These data suggested no ongoing meningeal inflammation below the window. The very low level of ongoing activity we observed at baseline also suggests a lack of an inflammatory response that could lead to afferent sensitization before CSD. This is now mentioned in the Discussion.

      We conducted volumetric imaging for three main reasons: 1) To capture the activity of afferents throughout the meningeal volume. In our volumetric imaging approach, including in this work, we observed afferent calcium signals throughout the meningeal thickness (see Figure 5 in Blaeser et al. 2022). However, the majority of afferents were localized to the most superficial 20 microns (Figure S1E in Blaeser et al. 2022), suggesting that we mostly recorded the activity of dural afferents; 2) to enable simultaneous quantification of three-dimensional deformation and the activity of afferents throughout the thickness of the meninges. This allowed us to determine whether changes in mechanosensitivity could involve augmented activity to intracranial mechanical forces that produced meningeal deformation along the Z-axis of the meninges (e.g., increased intracranial pressure); 3) to provide a direct means to confirm that the afferent GCaMP fluorescent changes we observed were not due to artifacts related to meningeal motion along the Z-axis. We have now added this information to the “Two-photon imaging” section of the Methods.

      2) Findings here, from mice with chronic closed cranial windows, failed to fully replicate prior findings from rats with acute open cranial windows. While the species, differing levels of inflammation and intracranial pressure in these two preparations may contribute, as the authors suggested, the modality of measuring neuronal activity could also contribute to the discrepancy. In the present study, conclusions are based entirely on fluorescence signals from GCamp6s, whereas prior rat studies relied upon multiunit recordings/local field potentials from tungsten electrodes inserted in the trigeminal ganglion.

      As a family, GCamp6 fluorophores are strongly pH dependent, with decreased signal at acidic pH values (at matched Ca2+ concentration). CSD induces an impressive acidosis transient, at least in the brain parenchyma, so one wonders whether the suppression of activity reported in the wake of CSD (Figure 2) in fact reflects decreased sensitivity of the GCamp6 reporter, rather than decreased activity in the fibers. If intracellular pH in trigeminal afferent fibers acidifies in the wake of CSD, GCamp6s fluorescence may underestimate the actual neuronal activity.

      Previous in vivo rodent studies observed a tissue acidosis transient that peaks during the DC shift corresponding to the wavefront of the spreading depolarization and lasts for ~10 min (Mutch and Hansen, 1984). Since we observed a massive increase in afferent calcium activity with a propagation pattern resembling the cortical wave, it is unlikely that the cortical acidosis during the CSD wave strongly affected the GCaMP signal in the overlying meninges. Furthermore, if cortical acidosis indiscriminately affected the GCaMP signal, one would expect a more consistent effect across axons, reflected as a reduced calcium signal in a larger proportion of the afferents, which we did not observe. Finally, the finding that in affected afferents, decreased calcium activity lasted for > 20 min, a time point when cortical acidosis has fully recovered, points to a distinct underlying mechanism. We also note that any residual acidosis would not confound our main finding of increased calcium responses to meningeal deformation at later periods post-CSD, as acidosis should, if anything, decrease calcium-related fluorescence.

      The authors might consider injecting an AAV encoding a pHi sensor to the trigeminal ganglion, and evaluating pHi during and after CSD, to assess how much this might be an issue for the interpretation of GCamp6s signals. Alternatively, experiments assessing trigeminal fiber (or nerve/ganglion) activity by electrophysiology or some other orthogonal method would strengthen the conclusions.

      Please see our comment above regarding the short duration of the pH changes post-CSD.

      N's are generally reported as # of afferents, obscuring the number of technical/biological replicates (# of imaging sessions, # of locomotion bouts, # of CSDs induced, # of animals).

      We now report the number of replicates (# of afferents, # of CSD events, and # of mice).

      Fig 1F trace over the heatmap is not explained in the figure legend. Is this the speed of the running wheel? Is it the apparent propagation rate of the GCamp6s transient through the imaging field?

      We have added to the legend of Figure 1 that the trace in panel F depicts locomotion speed.

    2. eLife assessment

      This fundamental study explored the impact of migraine-related cortical spreading depression (CSD) on the firing of nerves innervating the coverings of the brain that are considered the putative source of migraine-related pain. Using convincing approaches, the authors show that these responses are altered in response to mechanical deformation of the brain coverings. Given that migraine is characterized by worsening head pain in response to movement, the findings offer a potential mechanism that may explain this clinical phenomenon.

    3. Reviewer #1 (Public Review):

      Summary: Herein, Blaeser et al. explored the impact of migraine-related cortical spreading depression (CSD) on the calcium dynamics of meningeal afferents that are considered the putative source of migraine-related pain. Critically, previous studies have identified widespread activation of these meningeal afferents following CSD; however, most studies of this kind have been performed in anesthetized rodents. By conducting a series of technically challenging and compelling calcium imaging experiments in conscious head-fixed mice, they find, in contrast, that a much smaller proportion of meningeal afferents are persistently activated following CSD. Instead, they identify that post-CSD responses are differentially altered across a wide array of afferents, including increased and decreased responses to mechanical meningeal deformations and activation of previously non-responsive afferents following CSD. Given that migraine is characterized by worsening head pain in response to movement, the findings offer a potential mechanism that may explain this clinical phenomenon.

      Strengths: Using head-fixed conscious mice overcomes the limitations of anesthetized preps and the potential impact of anaesthesia on meningeal afferent function, which facilitated novel results when compared to previous anesthetized studies. Further, the authors used a closed cranial window preparation to maximize normal physiological states during recording, although the introduction of a needle prick to induce CSD will have generated a small opening in the cranial preparation, rendering it not fully closed as suggested. However, technical issues with available AAVs and alternate less invasive triggering methodologies necessitate the current approach.

      Weaknesses: Although this is a well-conducted, technically challenging study that has added valuable knowledge on the response of meningeal afferents, the study would have benefited from the inclusion of more female mice. Migraine is a female-dominant condition, and an attempt to compare potential sex differences in afferent responses would undoubtedly have improved the outcome. The authors report potential sex-specific effects on AAV transfection rates between males and females, which have contributed to this imbalance.

      The authors imply that the current method shows clear differences when compared to older anaesthetized studies; however, many of these were conducted in rats and relied on recording from the trigeminal ganglion. Attempts to address this point have proven difficult due to limited GCaMP signalling in anaesthetised mice, meaning that technical differences cannot be ruled out.

    4. Reviewer #2 (Public Review):

      This is an interesting study examining the question of whether CSD sensitizes meningeal afferent sensory neurons leading to spontaneous activity or whether CSD sensitizes these neurons to mechanical stimulation related to locomotion. Using two-photon in vivo calcium imaging based on viral expression of GCaMP6 in the TG, awake mice on a running wheel were imaged following CSD induction by cortical pinprick. The CSD wave evoked a rise in intracellular calcium in many sensory neurons during the propagation of the wave, but several patterns of afferent activity developed after the CSD. The minority of recorded neurons (10%) showed spontaneous activity while slightly larger numbers (20%) showed depression of activity; the latter pattern developed earlier than the former. The vast majority of neurons (70%) were unaffected by the CSD. CSD decreased the time spent running and the number of bouts per minute, but each bout was unaffected by CSD. There also was no influence of CSD on the parameters referred to as meningeal deformation, including scale, shear, and Z-shift. Using GLM, the authors then determine that there is an increase in locomotion/deformation-related afferent activity in 51% of neurons, a decrease in 12% of neurons, and no change in 37%. GLM coefficients were increased for deformation-related activity but not locomotion-related activity after CSD. There was also an increase in afferents responsive to locomotion/deformation following CSD that were previously silent. This study shows that, unlike prior reports, CSD does not lead to spontaneous activity in the majority of sensory neurons but that it increases sensitivity to mechanical deformation of the meninges. This has important implications for headache disorders like migraine, where CSD is thought to contribute to the pathology in unclear ways, with this new study suggesting that it may lead to increased mechanical sensitivity characteristic of migraine attacks.

    5. Reviewer #3 (Public Review):

      Summary: In this manuscript, Blaeser et al. explore the link between CSD and headache pain. How does an electrochemical wave in the brain parenchyma, which lacks nociceptors, result in pain and allodynia in the V1-3 distribution? Prior work had established that CSD increased the firing rate of trigeminal neurons, measured electrophysiologically at the level of the peripheral ganglion. Here, Blaeser et al. focus on the fine afferent processes of the trigeminal neurons, resolving Ca2+ activity of individual fibers within the meninges. To accomplish these experiments, the authors injected AAV encoding the Ca2+ sensitive fluorophore GCamp6s into the trigeminal ganglion, and 8 weeks later imaged fluorescence signals from the afferent terminals within the meninges through a closed cranial window. They captured activity patterns at rest, with locomotion, and in response to CSD. They found that mechanical forces due to meningeal deformations during locomotion (shearing, scaling, and Z-shifts) drove non-spreading Ca2+ signals throughout the imaging field, whereas CSD caused propagating Ca2+ signals in the trigeminal afferent fibers, moving at the expected speed of CSD (3.8 mm/min). Following CSD, there were variable changes in basal GCamp6s signals: these signals were unchanged in the majority of fibers, signals increased (after a ~20 min delay) in 10% of fibers, and signals decreased in 20% of fibers. Bouts of locomotion were less frequent following CSD, but when they did occur, they elicited more robust GCamp6s signals than pre-CSD. These findings advance the field, suggesting that headache pain following CSD can be explained on the basis of peripheral cranial nerve activity, without invoking central sensitization at the brain stem/thalamic level. This insight could open new pathways for targeting the parenchymal-meningeal interface to develop novel abortive or preventive migraine treatments.

      Strengths: The manuscript is well-written. The studies are broadly relevant to neuroscientists and physiologists, as well as neurologists, pain clinicians, and patients with migraine with aura and acephalgic migraine. The studies are well-conceived and appear to be technically well-executed.

      Weaknesses: In the present study, conclusions are based entirely on fluorescence signals from GCamp6s. Fluorescence experiments should be interpreted cautiously in the context of CSD. GCamp6 fluorophores are strongly pH dependent, with decreased signal at acidic pH values (at matched Ca2+ concentration). CSD induces an impressive acidosis transient in the brain parenchyma, so one wonders whether the suppression of activity reported in the wake of CSD (Figure 2) in fact reflects decreased sensitivity of the GCamp6 reporter, rather than decreased activity in the fibers. If intracellular pH in trigeminal afferent fibers acidifies in the wake of CSD, GCamp6s fluorescence may underestimate the actual neuronal activity.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This valuable paper examines gene expression differences between male and female individuals over the course of flower development in the dioecious angiosperm Trichosanthes pilosa. Male-biased genes evolve faster than female-biased and unbiased genes, which is frequently observed in animals, but this is the first report of such a pattern in plants. In spite of the limited sample size, the evidence is mostly solid and the methods appropriate for a non-model organism. The resources produced will be used by researchers working in the Cucurbitaceae, and the results obtained advance our understanding of the mechanisms of plant sexual reproduction and its evolutionary implications: as such they will broadly appeal to evolutionary biologists and plant biologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      The evolution of dioecy in angiosperms has significant implications for plant reproductive efficiency, adaptation, evolutionary potential, and resilience to environmental changes. Dioecy allows for the specialization and division of labor between male and female plants, where each sex can focus on specific aspects of reproduction and allocate resources accordingly. This division of labor creates an opportunity for sexual selection to act and can drive the evolution of sexual dimorphism.

      In the present study, the authors investigate sex-biased gene expression patterns in juvenile and mature dioecious flowers to gain insights into the molecular basis of sexual dimorphism. They find that a large proportion of the plant transcriptome is differentially regulated between males and females with the number of sex-biased genes in floral buds being approximately 15 times higher than in mature flowers. The functional analysis of sex-biased genes reveals that chemical defense pathways against herbivores are up-regulated in the female buds along with genes involved in the acquisition of resources such as carbon for fruit and seed production, whereas male buds are enriched in genes related to signaling, inflorescence development and senescence of male flowers. Furthermore, the authors implement sophisticated maximum likelihood methods to understand the forces driving the evolution of sex-biased genes. They highlight the influence of positive and relaxed purifying selection on the evolution of male-biased genes, which show significantly higher rates of non-synonymous to synonymous substitutions than female or unbiased genes. This is the first report (to my knowledge) highlighting the occurrence of this pattern in plants. Overall, this study provides important insights into the genetic basis of sexual dimorphism and the evolution of reproductive genes in Cucurbitaceae.

      Reviewer #2 (Public Review):

      Summary:

      This study uses transcriptome sequence from a dioecious plant to compare evolutionary rates between genes with male- and female-biased expression and distinguish between relaxed selection and positive selection as causes for more rapid evolution. These questions have been explored in animals and algae, but few studies have investigated this in dioecious angiosperms, and none have so far identified faster rates of evolution in male-biased genes (though see Hough et al. 2014 https://doi.org/10.1073/pnas.1319227111).

      Strengths:

      The methods are appropriate to the questions asked. Both the sample size and the depth of sequencing are sufficient, and the methods used to estimate evolutionary rates and the strength of selection are appropriate. The data presented are consistent with faster evolution of genes with male-biased expression, due to both positive and relaxed selection.

      This is a useful contribution to understanding the effect of sex-biased expression in genetic evolution in plants. It demonstrates the range of variation in evolutionary rates and selective mechanisms, and provides further context to connect these patterns to potential explanatory factors in plant diversity such as the age of sex chromosomes and the developmental trajectories of male and female flowers.

      Weaknesses:

      The presence of sex chromosomes is a potential confounding factor, since there are different evolutionary expectations for X-linked, Y-linked, and autosomal genes. Attempting to distinguish transcripts on the sex chromosomes from autosomal transcripts could provide additional insight into the relative contributions of positive and relaxed selection.

      Reviewer #3 (Public Review):

      The potential for sexual selection and the extent of sexual dimorphism in gene expression have been studied in great detail in animals, but hardly examined in plants so far. In this context, the study by Zhao, Zhou et al. represents a welcome addition to the literature.

      Relative to the previous studies in Angiosperms, the dataset is interesting in that it focuses on reproductive rather than somatic tissues (which makes sense to investigate sexual selection), and includes more than a single developmental stage (buds + mature flowers).

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      I have reviewed this new version and find that it now addresses some of the shortcomings of the previous manuscript. However, several important limitations still remain:

      1) The conclusion that sex-linked genes contribute relatively little to the patterns described is important and would be worth including in the manuscript briefly (not just the response letter), focusing for instance on the overall comparable proportions of sex-linked genes among male-biased (3/343=0.87%), female-biased (19/1145=1.66%) and unbiased genes (36/2378=1.51%).

      Authors’ response: Thank you for your advice. We have added these sentences in “Discussion” section (Lines 492-499).

      2) The new sentence included in the results "we also found that most of them were members of different gene families generated by gene duplication" is too vague. The motivation of this analysis is not explained, leaving the intended message unclear.

      Authors’ response: In the previous revision, as stressed by reviewer #1 “(2) Paragraph (407-416) describes the analysis of duplicated genes under relaxed selection but there is no mention of this in the results”, we added the sentence “we also found that most of them were members of different gene families generated by gene duplication” in “Relaxed selection” paragraph of the results. Accordingly, in “Discussion” section, we discussed the associations between gene duplication and relaxed selection (Lines 461-473).

      Following your suggestion, we revised the results (Lines 304-307) to “Using the RELAX model, we detected that 18 out of 343 OGs (5.23%) showed significant evidence of relaxed selection (K = 0.0184–0.6497) (Tables S9). Most of the 18 OGs are members of different gene families generated by gene duplication (Table S13)”. This makes it more coherent with the discussion.

      3) The sentences "given that dN/dS values of sex-biased genes were higher due to codon usage bias..." are very confusing. I do not understand the argument being made here. I do not see why "lower dS rates would be expected in sex-biased genes ..."

      Authors’ response: We respectfully argue that codon usage bias is positively related to synonymous substitution rates. That is, stronger codon usage bias may be related to higher synonymous substitution rates (Parvathy et al., 2022). Lower ENC values represent stronger codon usage bias. So, if ω (dN/dS) values of sex-biased genes are higher due to codon usage bias, we expect lower dS rates (that is, higher ENC values). Please refer to the relevant papers (e.g., Darolti et al., 2018; Catalan et al., 2018; Schrader et al., 2021, cited in the references of the paper).

      4) The manuscript now reports the proportion of unitigs annotated by similarity with a number of species. While this is an interesting observation, the reviewer was actually asking for a comparison between the number of unitigs (59,051) and the number of genes annotated in a typical Cucurbitaceae genome. This would give an indication of the level of redundancy of the de novo assembled transcriptome.

      Authors’ response: We admit that in the final assembly, transcripts may be overestimated. We respectfully suggest that it may be inappropriate to assess the redundancy of the de novo assembled transcriptome by comparing the transcriptome sequences with the genomic sequences. An appropriate approach is to compare transcriptome sequences and transcriptome sequences among different species. For example, Hu et al., 2020 (reference cited in the paper) obtained 145,975 non-redundant unigenes from flower buds of female and male plants in Trichosanthes kirilowii. Mohanty et al. (2017) obtained 71,823 non-redundant unigenes from flower buds of female and male plants in Coccinia grandis.

      Reference:

      Mohanty JN, Nayak S, Jha S, Joshi RK. 2017. Transcriptome profiling of the floral buds and discovery of genes related to sex-differentiation in the dioecious cucurbit Coccinia grandis (L.) Voigt. Gene. 626: 395-406.

      5) From reading the text I could not understand the extent to which the permutation test actually agreed with the Wilcoxon rank sum test. The text says that the results were "almost consistent", which is too vague. This paragraph should be clarified.

      Authors’ response: We performed permutation tests for sex-biased genes in floral buds and flowers at anthesis. However, only in floral buds were the results of both tests (permutation test and Wilcoxon rank sum test) significant. Taking your suggestions into consideration, we have revised the text as “Additionally, we found that only in floral buds, there were significant differences in ω values in the results of ‘free-ratio’ model (female-biased versus male-biased genes, P = 0.04282 and male-biased versus unbiased genes, P = 0.01114) and ‘two-ratio’ model (female-biased versus male-biased genes, P = 0.01992 and male-biased versus unbiased genes, P = 0.02127, respectively) by permutation t test, which is consistent with the results of Wilcoxon rank sum test. (Lines 273-280)”.

      6) The paragraph on the link between codon usage and dN/dS is very unclear and quite unnecessary. I would suggest to simply remove lines 312-323.

      Authors’ response: We respectfully argue that codon usage bias is one of the most important factors for higher rates of sequence evolution. Please refer to Darolti et al. (2018), Catalan et al. (2018) and Schrader et al. (2021) (cited in the references of the paper). We retain these lines here.

      7) The discussion contains many unnecessary repeats from the introduction and results section. I suggest shortening drastically at several places, including:

      • remove lines 367-369

      Authors’ response: Thank you for your suggestion. We revised these lines to “In this study, we compared the expression profiles of sex-biased genes between sexes and two tissue types, investigated whether sex-biased genes exhibited evidence of rapid evolutionary rates of protein sequences and identified the evolutionary forces responsible for the observed patterns in the dioecious Trichosanthes pilosa (Lines 369-373)”.

      We removed the sentence “We compared the expression profiles of sex-biased genes between sexes and two tissue types and examined the signatures of rapid sequence evolution for sex-biased genes, as well as the contributions of potential evolutionary forces. (Lines 374-376)”.

      • remove lines 395-410

      Authors’ response: Here we mainly discussed the possible associations between sex-biased genes, adaptation and sexual dimorphic traits. We retain them here for clarity.

      • remove lines 449-483, as they are almost entirely repetitions of elements already made clear in the results section.

      Authors’ response: In these paragraphs, we discussed reasons that lead to relaxed purifying selection for sex-biased genes. They are coherent with the results section. We retain them to make it clearer.

      Minor comments:

      • line 146: remove "However"

      Authors’ response: We have revised it.

      • line 187: "female flower buds tend to masculinize": the meaning is obscure

      Authors’ response: We revised them as “Using hierarchical clustering analysis, we evaluated different levels of gene expression across sexes and tissues (Fig. 2C). Gene expression for female floral buds clustered most distantly from expression in female flowers at anthesis. However, expression in male floral buds clustered with expression in female flowers at anthesis, suggesting that male floral buds may tend toward feminization in the early stages of floral development.”.

      • line 226: "we sequenced transcriptomes of T. pilosa": rather say "we used the transcriptomes described above for T. pilosa"

      Authors’ response: We have revised it.

      • line 279: the meaning of "branch-site model A and branch site model null" is still not made clear.

      Authors’ response: We have revised it.

      • line 324: change to: "we also analysed whether female-biased and unbiased genes underwent... "

      Authors’ response: We have revised it.

    2. eLife assessment

      This valuable paper examines gene expression differences between male and female individuals over the course of flower development in the dioecious angiosperm Trichosanthes pilosa. Male-biased genes evolve faster than female-biased and unbiased genes, which is frequently observed in animals, but this is the first report of such a pattern in plants. In spite of the limited sample size, the evidence is mostly solid and the methods appropriate for a non-model organism. The resources produced will be used by researchers working in the Cucurbitaceae, and the results obtained advance our understanding of the mechanisms of plant sexual reproduction and its evolutionary implications: as such they will broadly appeal to evolutionary biologists and plant biologists.

    3. Reviewer #1 (Public Review):

      The evolution of dioecy in angiosperms has significant implications for plant reproductive efficiency, adaptation, evolutionary potential, and resilience to environmental changes. Dioecy allows for the specialization and division of labor between male and female plants, where each sex can focus on specific aspects of reproduction and allocate resources accordingly. This division of labor creates an opportunity for sexual selection to act and can drive the evolution of sexual dimorphism.

      In the present study, the authors investigate sex-biased gene expression patterns in juvenile and mature dioecious flowers to gain insights into the molecular basis of sexual dimorphism. They find that a large proportion of the plant transcriptome is differentially regulated between males and females with the number of sex-biased genes in floral buds being approximately 15 times higher than in mature flowers. The functional analysis of sex-biased genes reveals that chemical defense pathways against herbivores are up-regulated in the female buds along with genes involved in the acquisition of resources such as carbon for fruit and seed production, whereas male buds are enriched in genes related to signaling, inflorescence development and senescence of male flowers. Furthermore, the authors implement sophisticated maximum likelihood methods to understand the forces driving the evolution of sex-biased genes. They highlight the influence of positive and relaxed purifying selection on the evolution of male-biased genes, which show significantly higher rates of non-synonymous to synonymous substitutions than female or unbiased genes. This is the first report (to my knowledge) highlighting the occurrence of this pattern in plants. Overall, this study provides important insights into the genetic basis of sexual dimorphism and the evolution of reproductive genes in Cucurbitaceae.

    4. Reviewer #2 (Public Review):

      Summary:

      This study uses transcriptome sequence from a dioecious plant to compare evolutionary rates between genes with male- and female-biased expression and distinguish between relaxed selection and positive selection as causes for more rapid evolution. These questions have been explored in animals and algae, but few studies have investigated this in dioecious angiosperms, and none have so far identified faster rates of evolution in male-biased genes (though see Hough et al. 2014 https://doi.org/10.1073/pnas.1319227111).

      Strengths:

      The methods are appropriate to the questions asked. Both the sample size and the depth of sequencing are sufficient, and the methods used to estimate evolutionary rates and the strength of selection are appropriate. The data presented are consistent with faster evolution of genes with male-biased expression, due to both positive and relaxed selection.

      This is a useful contribution to understanding the effect of sex-biased expression in genetic evolution in plants. It demonstrates the range of variation in evolutionary rates and selective mechanisms, and provides further context to connect these patterns to potential explanatory factors in plant diversity such as the age of sex chromosomes and the developmental trajectories of male and female flowers.

      Weaknesses:

      The presence of sex chromosomes is a potential confounding factor, since there are different evolutionary expectations for X-linked, Y-linked, and autosomal genes. Attempting to distinguish transcripts on the sex chromosomes from autosomal transcripts could provide additional insight into the relative contributions of positive and relaxed selection.

    5. Reviewer #3 (Public Review):

      The potential for sexual selection and the extent of sexual dimorphism in gene expression have been studied in great detail in animals, but hardly examined in plants so far. In this context, the study by Zhao, Zhou et al. represents a welcome addition to the literature.

      Relative to the previous studies in Angiosperms, the dataset is interesting in that it focuses on reproductive rather than somatic tissues (which makes sense to investigate sexual selection), and includes more than a single developmental stage (buds + mature flowers).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The apicoplast, a non-photosynthetic vestigial chloroplast, is a key metabolic organelle for the synthesis of certain lipids in apicomplexan parasites. Although it is clear that metabolite exchange between the parasite cytosol and the apicoplast must occur, very few transporters associated with the apicoplast have been identified. The current study combines data from previous studies with new data from biotin proximity labeling to identify new apicoplast resident proteins including two putative monocarboxylate transporters termed MCT1 and MCT2. The authors conduct a thorough molecular phylogenetic analysis of the newly identified apicoplast proteins and they provide compelling evidence that MCT1 and MCT2 are necessary for normal growth and plaque formation in vitro along with maintenance of the apicoplast itself. They also provide indirect evidence for a possible need for these transporters in isoprenoid biosynthesis and fatty acid biosynthesis within the apicoplast. Finally, mouse infection experiments suggest that MCT1 and MCT2 are required for normal virulence, with virulence completely lacking for MCT2 at the administered dose. Overall, this study is generally of high quality, includes extensive quantitative data, and significantly advances the field by identifying several novel apicoplast proteins together with establishing a critical role for two putative transporters in the parasite. The study, however, could be further strengthened by addressing the following aspects:

      Response: We thank the reviewer very much for his/her positive evaluation of our work. To address the detailed function of the transporters, in the past three months, we have re-constructed plasmids (with codon-optimized DNA sequences of the genes) for expression of the transporters in a standard E. coli expression strain (BL21DE3) and in a pyruvate import knockout E. coli strain (a gift from Prof. Kirsten Jung), to examine the transport capability in vitro. We have also re-constructed a new plasmid containing a new leading peptide for targeting the pyruvate sensor PyronicSF to the apicoplast in the parasite, to probe the possible substrate pyruvate. However, we did not successfully observe expression of the transporters in the above E. coli strains, and we were unable to target the sensor to the correct localization (the apicoplast) in the parasite. As a result, the functional identification of the transporters remains as presented in the current version of the manuscript. We will keep working on this aspect, attempting to dissect out the exact transport function of the transporters in the future. In the current manuscript, we have discussed the limitations of our study in the last part of the manuscript.

      Main comments

      1) The conclusion that conditional depletion of AMT1 and/or AMT2 affects apicoplast synthesis of IPP is only supported by indirect measurements (effects on host GFP uptake or trafficking, possibly due to effects on IPP dependent proteins such as rabs, and mitochondrial membrane potential, possibly due to effects on IPP dependent ubiquinone). This conclusion would be more strongly supported by directly measuring levels of IPP. If there are technical limitations that prevent direct measurement of IPP then the authors should note such limitations and acknowledge in the discussion that the conclusion is based on indirect evidence.

      Response: We thank the reviewer very much for the suggestions. We have tried to establish the measurement of IPP using a commercial company in recent months, yet we have not been successful in making the assay work. Considering the problem of indirect evidence, we have discussed this limitation in the discussion.

      2) The conclusion that conditional depletion of AMT1 and/or AMT2 affects apicoplast synthesis of fatty acids is also poorly supported by the data. The authors do not distinguish between the lower fatty acid levels being due to reduced synthesis of fatty acids, reduced salvage of host fatty acids, or both. Indeed, the authors provide evidence that parasite endocytosis of GFP is dependent on AMT1 and AMT2. Host GFP likely enters the parasite within a membrane bound vesicle derived from the PVM. The PVM is known to harbor host-derived lipids. Hence, it is possible that some of the decrease in fatty acid levels could be due to reduced lipid salvage from the host. Experiments should be conducted to measure the synthesis and salvage of fatty acids (e.g., by metabolic flux analysis), or the authors should acknowledge that both could be affected.

      Response: We thank the reviewer very much for the comments and suggestions. We partially agree with the comment that the depletion of transporters could affect lipids scavenged from the host cells, as endocytic vesicles are indeed derived from the parasite plasma membrane at the micropore and potentially from the host cell endo-membrane system, as demonstrated with the micropore endocytosis in our previous study (pmid: 36813769). Our latest study has addressed this by showing that the endocytic trafficking of GFP vesicles is regulated by prenylation of proteins (e.g. Rab1B and YKT6.1), depletion of which resulted in diffusion of GFP vesicles, but not disappearance of GFP vesicles in the parasites (pmid: 37548452), indicating that the vesicles (containing lipids) enter the parasites. In the current manuscript, the percentage of parasites containing GFP foci was significantly reduced in AMT1/AMT2-depleted parasites, and instead, parasites containing GFP diffusion appeared and the percentage was almost equal to the reduced level of parasites with GFP foci. These results suggested that endocytic vesicles (e.g. GFP vesicles) were continuously generated by the micropore in the parasites depleted of AMT1/AMT2, and that the vesicle trafficking was regulated by proteins modified by IPP derivatives that were derived from the apicoplast. Based on these observations, we considered that lipids in endocytic vesicles should not contribute to the reduced level of fatty acids and other lipids in parasites depleted of AMT1/AMT2. We have added a short discussion concerning the fatty acids and lipids reduced in the parasites.

      Reviewer #2 (Public Review):

      In this study Hui Dong et al. identified and characterized two transporters of the monocarboxylate family, which they called Apicomplexan monocarboxylate 1 and 2 (AMC1/2) that the authors suggest are involved in the trafficking of metabolites in the non-photosynthetic plastid (apicoplast) of Toxoplasma gondii (the parasitic agent of human toxoplasmosis) to maintain parasite survival. To do so they first identified novel apicoplast transporters by conducting proximity-dependent protein labeling (TurboID), using the sole known apicoplast transporter (TgAPT) as a bait. They chose two out of the three MFS transporters identified by their screen based on protein sequence similarity and confirmed apicoplast localisation. They generated inducible knock down parasite strains for both AMC1 and AMC2, and confirmed that both transporters are essential for parasite intracellular survival, replication, and for the proper activity of key apicoplast pathways requiring pyruvate as carbon sources (FASII and MEP/DOXP). Then they show that deletion of each protein induces a loss of the apicoplast, more marked for AMC2 and affects its morphology both at its four surrounding membranes level and accumulation of material in the apicoplast stroma. This study is very timely, as the apicoplast holds several important metabolic functions (FASII, IPP, LPA, Heme, Fe-S clusters...), which have been revealed and studied in depth but no further respective transporters have been identified thus far. Hence, new studies that could reveal how the apicoplast can acquire and deliver all the key metabolites it deals with, will have strong impact for the parasitology community as well as for the plastid evolution communities. The current study is well initiated with appropriate approaches to identify two new putatively important apicoplast transporters, and showing how essential those are for parasite intracellular development and survival. However, in its current state, this is all the study provides at this point (i.e. essential apicoplast transporters disrupting apicoplast integrity, and indirectly its major functions, FASII and IPP, as any essential apicoplast protein disruption does). The study fails to deliver further message or function regarding AMC1 and 2, and thus validate their study. Currently, the manuscript just describes how AMC1/2 deletion impacts parasite survival without answering the key question about them: what do they transport? The authors have yet to perform key experiments that would reveal their metabolic function. I would thus recommend the authors work further and determine the function of AMC1 and 2.

      Response: We thank the reviewer very much for his/her positive evaluation of our work. To address the detailed function of the transporters, in the past three months, we have re-constructed plasmids (with codon-optimized DNA sequences of the genes) for expression of the transporters in a standard E. coli expression strain (BL21DE3) and in a pyruvate import knockout E. coli strain (a gift from Prof. Kirsten Jung), to examine the transport capability in vitro. We have also re-constructed a new plasmid containing a new leading peptide for targeting the pyruvate sensor PyronicSF to the apicoplast in the parasite, to probe the possible substrate pyruvate. However, we were unable to observe expression of the transporters in the above E. coli strains, and we were unable to target the sensor to the correct localization (the apicoplast) in the parasite. As a result, the functional identification of the transporters remains as presented in the current version of the manuscript. We will keep working on this aspect, attempting to dissect out the exact transport function of the transporters in the near future. In this current manuscript, we have discussed the limitations of our study in the last part of the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Minor comments

      Line 35: ...appears to have evolved...

      Line 67: remove first comma

      Line 105: thereafter or therefore?

      Line 130: define ACP

      Line 131: define TMD

      Response: We thank the reviewer very much for the suggestions, and we have revised the points in the current manuscript.

      Figure 1: more information on APT1 would be helpful for readers to interpret the results from turboID e.g., consider showing an illustration showing, according to Karnataki et al 2007 that APT1 likely occupies all 4 membranes of the apicoplast. Also, according to DeRocher et al 2012, APT1 N-term and C-term are both cytosolically exposed, at least in the outermost membrane. The orientation in the other membranes is not known.

      Response: We thank the reviewer very much for the suggestions. We analyzed the localization information of APT1 in T. gondii, based on the studies the reviewer cited (Karnataki et al., 2007; DeRocher et al., 2012). The HA tag at the C-terminus of APT1 was distributed at the four membranes of the apicoplast, indicating that the topology of APT1 might be difficult to define at the membranes. Considering this information, we felt hesitant to clearly describe the topology in a schematic diagram of the protein APT1. Nevertheless, the TurboID tagging at the C-terminus of APT1 was an excellent model for identification of potential transporters localized at membranes of the apicoplast. We have put more information about the topology of APT1 in the manuscript, thus providing a better understanding of the proteomic results.

      Figure 2: add a space between "T." and "gondii"

      Figure 2: remove period between "Fitness" and "scores"

      Figure 2: different fonts are used within the figure. Consider using only one font such as arial. Same for Figure 4.

      Figure 2: "Fitness scores" is not bold in panel A but is bold in panel B.

      Response: We thank the reviewer very much for the suggestions. We have revised the points in the current version of the manuscript.

      Line 187: superscript -7

      Line 249: Caution should be used in interpreting two bands as being a precursor and mature product without additional experiments to establish such a relationship. Consider using the term "might" rather than "appear to". The presence of multiple bands could be due to phenomena other than proteolytic processing e.g., alternative splicing, alternative initiator codons, etc.

      Response: We thank the reviewer very much for the suggestions. We have revised the sentences in the current version of the manuscript.

      Line 291: define IPP

      Figure 3E. The data points for KD strains appear to be positioned above the zero value on the y-axis. Is this correct?

      Response: We thank the reviewer very much for the suggestions. We have rechecked the figure and replaced it with the correct one.

      Figure 3 G/H legend. Please describe what a single data point represents e.g., the average of one field of view, the average of a certain number of fields of view, or something else? Are the data combined from three experiments or from a representative experiment?

      Response: We thank the reviewer very much for the suggestions. Three independent experiments were performed with at least three replicates. At least 150 vacuoles were scored in each replicate, thus resulting in at least 9 data points in total. The data points were shown with the results from each replicate.

      Line 325: define MEP and explain how it is connected to IPP

      Response: We thank the reviewer very much for the suggestions. We have provided the information in the current version of the manuscript.

      Lines 351-355: The authors refer to Figure 4D to support this statement, but presumably they mean 4E. Also, the authors use the terms C14, C16, and C18. They should more precisely use the terms myristic acid, palmitoleic acid, and trans_oleic acid if this is what they are referring to. Finally, the authors should determine if there is a statistically significant difference between levels of these fatty acids between AMT1 KD and AMT2 KD. If not, they should suggest there is an overall trend toward lower levels of these fatty acids in AMT2 KD parasites compared to AMT1 KD parasites.

      Response: We thank the reviewer very much for the suggestions. We have revised the information in the current version of the manuscript.

      Lines 363-364: The basis of this comment is unclear. Please clarify.

      Lines 369-370: the authors have not shown that the observed lower levels of fatty acids are due to synthesis, as noted above

      Response: We thank the reviewer very much for the suggestions. We have accordingly revised the information in the current version of the manuscript.

      Line 383: Should be Figure S6D

      Line 386: An entire section of the results is used to describe data that are entirely in a supplemental figure. Consider moving this data to a main figure.

      Response: We thank the reviewer very much for the suggestions. We have transferred the data to the main figure in the current version of the manuscript.

      Line 391: Consider using the term virulence instead of growth since no experiments were performed to specifically assess parasite growth in the infected mice.

      Response: We thank the reviewer very much for the suggestions. We have revised the terms in the Results section.

      Line 427: Perhaps the authors mean "...strong growth defect..." or "...strong growth impairment..."

      Line 460-461: This statement is unclear. Please explain how strong backgrounds in proteomics have made it difficult to identify apicoplast transporters. Because they are low abundance? Because they are membrane proteins?

      Response: We thank the reviewer very much for the suggestions. We have revised the corresponding sentences in the current version. The strong backgrounds in the proteomics resulted from the high activity and nonspecific labeling of biotin ligase fused with the apicoplast proteins.

      518-521: It would be helpful for non-specialists if the authors explained how pyruvate is connected to IPP biosynthesis.

      523: delete period after "Escherichia"

      548-549: "We observed similar decreases in level of the MEP biosynthesis activity upon depletion of AMT1 and AMT2..." Reword this since no experiments were done to measure MEP biosynthesis activity.

      Response: We thank the reviewer very much for the suggestions. We have accordingly revised the relevant sentences in the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Major points:

      • The metabolomic data on fatty acid synthesis and isoprenoid levels is relevant but cannot inform about the function of the transporter, since any protein causing loss of the apicoplast would behave in such a manner, i.e. block the apicoplast pathways.

      Response: We thank the reviewer very much for the comment, with which we agree. We have thus discussed these points in a subsection in the Discussion, pointing out some of the limitations in the study.

      • Currently, the manuscript fails to directly prove what AMC1 and AMC2 transport, potentially pyruvate as suggested to putatively fuel FASII and MEP/DOXP. Further experimental approaches using exogenous complementation and/or metabolomic analyses using stable isotope labelling (for example) should potentially bring light to the putative functions of AMC1/2.

      Response: We thank the reviewer very much for the comments. As described above, we attempted several approaches to find out the substrates that AMT1 and AMT2 transport. However, we could not successfully express the proteins in E. coli strains, and we did not generate a T. gondii strain in which a pyruvate sensor was properly targeted to the apicoplast. At the end of the Discussion, we have a subsection that discusses the limitations of this study. We hope that our future approaches will be able to tackle these difficulties in substrate identification.

      Furthermore, the authors have not considered other pathways of interest, like heme or lysophosphatidic acid (LPA) synthesis, which are two other key pathways that may be related to AMC1/2 function. Those proposed experiments represent an important body of work, required to bring light to their metabolic functions.

      Response: We thank the reviewer very much for the comments. We thought about that, but we finally decided to mainly discuss two of the pathways that the transporters might participate in, since the transporters contain specific domains in their protein sequences that are potentially associated with pyruvate.

      Further, the authors might have partially missed some referencing and data about the apicoplast in their introduction (and potentially to address other facets of the apicoplast metabolic functions/capacities in regards to AMC1/2 function): the introduction referencing and explanations are somehow not fully exact/precise for the part of the apicoplast and its pathway: references about the apicoplast, discovery and origin are not citing the original work (that should be Wilson et al. 1996, McFadden et al. 1996, Kohler et al. 1997), same for the discovery of FASII and MEP/DOXP (Waller 1998, Jomaa et al...). The introduction (and the study?) lacks information about other key functions of the apicoplast: heme synthesis, lysophosphatidic acid synthesis (using FASII products). The explanations about the roles of FASII/DOXP are partial and not fully citing important references: Krishnan et al. 2020, and Amiar et al. 2020 are also key to understanding how the role of FASII is metabolically flexible depending on nutrient content. A whole part on the fact that FASII is not only dispensable but can also become essential under metabolic adaptation conditions is missing (Botté et al. 2013, Amiar et al. 2020, Primo et al. 2021). These novel important facets of parasite biology should be mentioned as well as directly linked to the authors' topic. This is more minor but could bring new ideas to the authors.

      Response: We thank the reviewer very much for the suggestions. We have revised the relevant part in the introduction.

      We are grateful for the suggestions to improve the manuscript.

    2. eLife assessment

      This study identifies two new transporters in the apicoplast, a non-photosynthetic organelle of apicomplexan parasites. While this is important work, it only partially reveals how essential these transporters are, as it does not address the metabolic function of the transporters for the parasite. Although the evidence is still incomplete, the results should be of interest to parasitologists and eukaryotic cell biologists.

    3. Reviewer #1 (Public Review):

      The apicoplast, a non-photosynthetic vestigial chloroplast, is a key metabolic organelle for the synthesis of certain lipids in apicomplexan parasites. Although it is clear that metabolite exchange between the parasite cytosol and the apicoplast must occur, very few transporters associated with the apicoplast have been identified. The current study combines data from previous studies with new data from biotin proximity labeling to identify new apicoplast resident proteins including two putative monocarboxylate transporters termed MCT1 and MCT2. The authors conduct a thorough molecular phylogenetic analysis of the newly identified apicoplast proteins and they provide compelling evidence that MCT1 and MCT2 are necessary for normal growth and plaque formation in vitro along with maintenance of the apicoplast itself. They also provide indirect evidence for a possible need for these transporters in isoprenoid biosynthesis and fatty acid biosynthesis within the apicoplast. Finally, mouse infection experiments suggest that MCT1 and MCT2 are required for normal virulence, with virulence completely lacking for MCT2 at the administered dose. Overall, this study is generally of high quality, includes extensive quantitative data, and significantly advances the field by identifying several novel apicoplast proteins together with establishing a critical role for two putative transporters in the parasite. The study, however, could be further strengthened by addressing the following aspects:

      Main comments:

      1. The conclusion that conditional depletion of AMT1 and/or AMT2 affects apicoplast synthesis of IPP is only supported by indirect measurements (effects on host GFP uptake or trafficking, possibly due to effects on IPP dependent proteins such as rabs, and mitochondrial membrane potential, possibly due to effects on IPP dependent ubiquinone). This conclusion would be more strongly supported by directly measuring levels of IPP. If there are technical limitations that prevent direct measurement of IPP then the authors should note such limitations and acknowledge in the discussion that the conclusion is based on indirect evidence.

      2. The conclusion that conditional depletion of AMT1 and/or AMT2 affects apicoplast synthesis of fatty acids is also poorly supported by the data. The authors do not distinguish between the lower fatty acid levels being due to reduced synthesis of fatty acids, reduced salvage of host fatty acids, or both. Indeed, the authors provide evidence that parasite endocytosis of GFP is dependent on AMT1 and AMT2. Host GFP likely enters the parasite within a membrane bound vesicle derived from the PVM. The PVM is known to harbor host-derived lipids. Hence, it is possible that some of the decrease in fatty acid levels could be due to reduced lipid salvage from the host. Experiments should be conducted to measure the synthesis and salvage of fatty acids (e.g., by metabolic flux analysis), or the authors should acknowledge that both could be affected.

    4. Reviewer #2 (Public Review):

      In this study Hui Dong et al. identified and characterized two transporters of the monocarboxylate family, which they called Apicomplexan monocarboxylate 1 and 2 (AMC1/2) that the authors suggest are involved in the trafficking of metabolites in the non-photosynthetic plastid (apicoplast) of Toxoplasma gondii (the parasitic agent of human toxoplasmosis) to maintain parasite survival. To do so they first identified novel apicoplast transporters by conducting proximity-dependent protein labeling (TurboID), using the sole known apicoplast transporter (TgAPT) as a bait. They chose two out of the three MFS transporters identified by their screen based on protein sequence similarity and confirmed apicoplast localisation. They generated inducible knock down parasite strains for both AMC1 and AMC2, and confirmed that both transporters are essential for parasite intracellular survival, replication, and for the proper activity of key apicoplast pathways requiring pyruvate as carbon sources (FASII and MEP/DOXP). Then they show that deletion of each protein induces a loss of the apicoplast, more marked for AMC2 and affects its morphology both at its four surrounding membranes level and accumulation of material in the apicoplast stroma. The authors attempted to decipher the function of the transporters on metabolic functions of the apicoplast: (a) notably for IPP synthesis through the assessment of vesicle import allowed by IPP-based anchors, which was found to be affected in the mutants, as well as (b) apicoplast fatty acid synthesis by indirect assessment of vesicle import. However, none of them directly concluded on the actual function of the transporters. Furthermore, heterologous complementation in a bacterial system also failed to demonstrate the transporters' function.

      However, this study is very timely, as the apicoplast holds several important metabolic functions (FASII, IPP, LPA, Heme, Fe-S clusters...), which have been revealed and studied in depth but no further respective transporters have been identified thus far. Hence, new studies that could reveal how the apicoplast can acquire and deliver all the key metabolites it deals with, will have strong impact for the parasitology community as well as for the plastid evolution communities. The current study is well initiated with appropriate approaches to identify two new putatively important apicoplast transporters, and showing how essential those are for parasite intracellular development and survival. However, in its current state, this is all the study provides at this point (i.e. essential apicoplast transporters disrupting apicoplast integrity, and indirectly its major functions, FASII and IPP, as any essential apicoplast protein disruption does). The study fails to deliver further message or function regarding AMC1 and 2, and thus validate their study. Currently, the manuscript just describes how AMC1/2 deletion impacts parasite survival without answering the key question about them: what do they transport? The authors have yet to perform key experiments that would reveal their metabolic function. Ideally the authors would work further and determine the function of AMC1 and 2.

    1. eLife assessment

      Drawing on a human population genomic data set, this valuable study seeks to show that potentially advantageous alleles are on average older than neutral alleles, invoking the action of balancing selection as the underlying explanation. Currently it is unfortunately unclear how robust the estimates of allele ages are, and the evidence for the authors' proposal is therefore at this stage incomplete. If confirmed, the conclusions would be of interest to population genomicists, especially those studying humans.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors attempt to reinvestigate an old question in population genetics regarding the age of alleles that have experienced different strengths (and directions) of natural selection. Under simple population genetic models, alleles that are positively selected are expected to change frequency in populations faster than neutral alleles. So the naïve expectation is that if you look at alleles that are at the same population frequency, those that have been evolving neutrally should have been segregating in the population longer than those that have been experiencing natural selection. While this is exactly what the authors find for alleles inferred to be experiencing negative selection (i.e. they tend to be younger than alleles inferred to be neutral that are at the same frequency), the authors find the opposite for alleles inferred to be under positive selection: they tend to be older than alleles inferred to be neutral. The authors argue that this pattern can be explained by a model where positively selected mutations experience a phase of balancing selection that can dramatically extend the period of time that these alleles segregate in the population.

      Strengths:

      The question that the authors address is very interesting and thought-provoking. When confronted with a counter-intuitive finding, the authors describe an interesting hypothesis to explain it. The authors investigate a number of interesting sub-analyses to corroborate their findings.

      Weaknesses:

      While there are some intriguing hypotheses in this manuscript, I struggle to be convinced. The main point that the authors argue is that positively selected alleles are older than their neutral counterparts at the same frequency. They argue that this may be because the positively selected alleles are stuck in some form of balancing selection for a long time before they switch to a more classical form of directional selection. The form of balancing selection they argue for is one caused by linkage to deleterious alleles, where it takes time for the beneficial alleles to recombine onto a more neutral background. I would really like to see some simulations that demonstrate this can actually occur on average. Reading this paper brought back memories of the classic Birky and Walsh (1988; PMCID: PMC281982) paper that argued that linkage amongst selected alleles does not impact the substitution rate of linked neutral alleles, but does reduce the substitution rate among beneficial alleles. Their simple simulations in 1988 illuminated how this works, and they developed a simple mathematical model that helped us understand how it works. In the current paper, it seems the authors are arguing for a similar effect, but rather than focus on beneficial alleles that fix, they are focusing on beneficial alleles that are still segregating. These seem like similar stories, but without simulations or a mathematical model, I struggle to gain any insight into why the observation is the way it is (and not simply due to a number of possible confounding effects noted below).

      There are a number of elements to the methods and interpretation that could use clarification.

      • Genetic data. One of the biggest weaknesses of this analysis is the choice of genetic data. The authors use the UK10k dataset, and reference the 2015 paper. Looking at that paper, it seems that the data may be composed of low coverage whole genome sequencing data (7x) and high coverage exome sequence data (80x). It appears that these data were integrated into a single VCF file, similar to the 1000 Genomes Project Phase 3 data. If these are the data that were used, then there are substantial differences between the coding and non-coding variants that are compared. However, it is possible that the authors chose to restrict the analysis to the low coverage WGS data and neglected to indicate it in the methods section. I will assume that this is the case for the rest of the review, but the authors should clarify.

      • Recombination rates. I believe the authors use an LD-based recombination map. While these maps are correlated at the longer physical distances with pedigree maps, there are substantial differences at shorter physical scales. These differences have been argued to be due to the action of natural selection skewing patterns of LD. If that is the case, then some of the observations in this paper are circular. Please confirm similar findings with a pedigree-based recombination map.

      • Recombination rates, pt 2. The authors compare patterns of non-synonymous coding variants to a set of non-coding, non-regulatory SNPs. They argue "these will necessarily have experienced similar mutational and recombinational processes". I don't know that this is true. There are both distinct recombination patterns and mutational patterns in genes vs non-coding regions of the genome. It would be important to more carefully match coding and non-coding variants based on both recombination as well as the type of nucleotide change. There are substantial differences in CpG composition in coding vs non-coding regions, for example. While the authors say "Analyses thought to be sensitive to CpG high mutability were limited to SNPs that did not occur as part of a CpG", it is quite unclear where CpGs were included vs excluded.

      • Identifying ancestral vs derived alleles. It is unclear how the authors identified ancestral vs derived alleles (they say "inferred ancestral sequence from Ensembl (1) and a maximum likelihood estimator"). Several studies have shown that ancestral misidentification can cause skews in the site frequency spectrum. If the ancestral state of some fraction of alleles were misidentified, then the estimated allele age would be incorrect. Figure 1B shows that the mean frequency of the alleles with the largest delta-EP tends to be very low. This makes me think that ancestral misidentification may have impacted the results.

      • Figure 2B and C. I do not understand how the median can be so far outside the mean and error bars. The legend does not specify what the error bars are, but I feel the distribution must be shown if it is so skewed that the mean and any definition of error do not include the median.

      • Inferring allele ages. The authors use two methods for estimating allele ages, but focus on GEVA. They use the default parameter of effective population size 10,000. How sensitive is the model to this assumption? It has been shown that different regions of the genome (particularly coding vs neutral non-coding) experience different rates of deleterious mutations, and therefore different rates of background selection. Simple models of background selection would suggest that these regions will therefore have different effective population sizes.

      • Fst analysis. The authors look at Fst among 3 populations as a function of delta-EP compared to frequency-matched control SNPs. They find there is no statistical support for different levels of Fst in any pairwise comparison for any delta-EP bin. It seems strange that alleles with large delta-EP would not show increased Fst compared to control SNPs... If they are indeed positively selected, the assumption must be that they are then positively selected in all populations, which seems unlikely. Alternatively, by considering only narrow allele frequency bins, it is possible that Fst is also being controlled, and therefore this analysis is non-informative. A simulation would help understand what the expected pattern is here.

      • It would be great to show more figures like 2A. You can place the x-axis on a log-scale so that it is easier to view the lower allele frequencies. This plot clearly shows differences among the 3 categories. I am very surprised at the much shorter error bars for negative delta-EP at high frequency compared to positive delta-EP variants... Shouldn't there be very few negative delta-EP alleles at such high frequency?

    3. Reviewer #2 (Public Review):

      The authors provide an analysis showing that the allele ages of putatively advantageous alleles tend to be older than those of neutral alleles. To do this, the authors first classify mutations as either neutral, advantageous or deleterious based on a metric called the 'evolutionary probability', which is correlated with the impact of selection acting on a mutation. Then, the authors quantify the age of the mutations using the GEVA method and they also quantify tc (the time of the ancestral node of the edge carrying the mutation). Interestingly, the authors find that advantageous mutations tend to have an older allele age and an older value of tc compared to neutral mutations. The authors posit some explanations for this result invoking the action of balancing selection.

      This is an interesting paper, and its results could merit an important change in our conception of how natural selection acts on the human genome. I have concerns about some of the analyses presented in this paper that relate to two main factors: 1) Showing that the estimates of allele ages and tc are robust on the dataset presented (more on this topic below). 2) Presenting more simulations or analytical theory showing that the models proposed by the authors to explain the results indeed fit the data well. As an example, the authors could perform some simulations (likely using SLiM) under the balancing selection models they posit and then show that these can produce data where the allele ages for deleterious, neutral and advantageous alleles have similar patterns to what is observed in the genomic dataset analyzed.

      Major concerns

      - What is the impact of multiple mutations on the same site on the estimates of allele ages with GEVA?

      - GEVA, which is one of the methods used by the authors, 'overestimates "intermediate" times and underestimates older times' according to Ragsdale and Thornton (2023) MBE. What is the impact of this effect on the analysis performed by the authors? Does RUNTC have any known biases in its estimate of tc?

      - Additionally, what is the impact of phasing errors on the estimates of allele age presented by the authors?

    4. Reviewer #3 (Public Review):

      In their manuscript, Pivirotto et al. make an unexpected observation that a set of candidate beneficial alleles according to the Evolutionary Probability method (EP) have estimated ages thousands of years older than control alleles of similar frequency and outside of functional segments. To explain these unexpectedly older ages, the authors propose a number of interesting evolutionary processes related to balancing selection, including staggered sweeps.

      It is important to first mention that the authors do find that as expected, deleterious alleles are younger than controls. This provides evidence that the allele age estimates used by the authors are of sufficient quality to detect age differences between groups of genes. I am also convinced by the fact that EP can be used to focus on a set of alleles substantially enriched in deleterious ones, given the very clear frequency patterns related to EP.

      I have a number of concerns about the manuscript, including one rather serious one.

      My main concern is that many of the observations made by the authors could be caused by mispolarization of alleles, where either (i) mostly low frequency derived alleles are mischaracterized as ancestral and the other, actually ancestral allele is mischaracterized as a high frequency derived allele, or (ii) mostly low frequency ancestral alleles are mischaracterized as derived. Unfortunately, the authors do not even mention the risk of mispolarization in their manuscript. This is a serious problem for this manuscript because ancestral alleles annotated as derived are by definition going to generate older age estimates than if they were truly derived. It would be very useful to be able to have a look at the full distribution of allele ages rather than just confidence intervals as in Figure 1. I happen to have experience with mispolarization of high frequency ancestral alleles as derived by a maximum likelihood method, different from the one used by the authors (Keightley et al Genetics 2018), where the mispolarization became visible as a very suspicious SFS with a visible excess of high frequency variants, especially those expected to be functional (because of the relatively larger corresponding supply of low frequency deleterious functional variants). Even if the ML method used by the authors is not the same, mispolarization is still a serious risk. Glémin et al. Genome Research 2015 also found that mispolarization is far from being a negligible issue.

      Mispolarization of low frequency alleles may be especially prominent in the case of mispolarized deleterious alleles associated with a very negative delta-EP, that then appear as alleles with a very positive delta-EP. Focusing on high delta-EP alleles may then in fact enrich the dataset in mispolarized alleles that then result in older age estimates. Looking at Figure 1B especially, I am worried by the fact that very high delta-EP values seem to go back to the frequencies observed for very negative delta-EP. This is what mispolarization of low frequency alleles might cause as a pattern, in this case especially low frequency ancestral alleles being misidentified as derived?

      The authors can address the possible issue of mispolarization in multiple ways. First, they can use simulations of sequences to estimate amounts of mispolarization based on their polarization approach, using substitution/mutation rates as realistic as possible. Second, the authors could check if there is suspicious symmetry in the distribution of delta-EP between alleles at frequency f and alleles at frequency 1-f. This pattern could be generated by mispolarization.
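
      To make the concern concrete, the following is a minimal sketch of how polarization error reshapes an unfolded site frequency spectrum. It is not the reviewer's or the authors' analysis. It assumes the standard neutral expectation (relative count proportional to 1/i for derived-allele count i), and the sample size of 20 and the 5% error rate are arbitrary illustrative values. A mispolarized site with derived count i is recorded at count n-i, so each class is mixed with its mirror class.

```python
def neutral_sfs(n):
    """Relative expected counts for derived-allele counts 1..n-1
    under the standard neutral model (proportional to 1/i)."""
    return [1.0 / i for i in range(1, n)]


def mispolarize(sfs, error_rate):
    """Mix each class with its mirror class: a site with derived count i
    that is mispolarized is recorded at count n - i instead."""
    m = len(sfs)
    return [
        (1 - error_rate) * sfs[k] + error_rate * sfs[m - 1 - k]
        for k in range(m)
    ]


n = 20             # hypothetical sample size (number of chromosomes)
error_rate = 0.05  # hypothetical fraction of mispolarized sites

true_sfs = neutral_sfs(n)
observed = mispolarize(true_sfs, error_rate)

for i, (t, o) in enumerate(zip(true_sfs, observed), start=1):
    print(f"derived count {i:2d}: true {t:.4f}  observed {o:.4f}")
```

      Even a small error rate roughly doubles the highest-frequency class in this toy setting, because the mirror class (singletons) is so much larger. This is the kind of excess of high-frequency "derived" variants that would be expected to carry inflated age estimates.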

      My second less serious concern has to do with the use of high delta-EP as evidence that alleles are beneficial. The validation set from the Patel & Kumar 2019 paper is arguably small with 24 known selected variants. It does not follow from the fact that a small set of known selected variants have higher delta-EP, that all variants with high delta-EP tend to be beneficial. This is especially true in the case where beneficial variants tend to be rare, and there are then far more variants expected with high delta-EP than there are beneficial variants. I am willing to change my mind on this if the overall results can be shown to be robust after accounting for allele mispolarization.

      Third, I like the idea of staggered sweeps to explain the results, but I am wondering if there is any evidence in the literature of interference between deleterious and advantageous variants that the authors could base their proposed explanation on.

      Finally, and I realize that it is a bit of a stretch, I am wondering if the authors could better justify their choices of methods to estimate the age of alleles. What about ARGweaver, Relate or tsdate? How do these methods compare with GEVA? From looking at the literature I could not find a direct comparison of the precision of GEVA compared to these other tools, but it may be worth at least discussing that the results could be further put to the test with other available ARG-based tools to estimate allele ages. Wilder Wohns et al. Science 2022 compare the performance of these different ARG methods with ancient DNA data, and in fact find that GEVA does not perform as well as for example Relate or tsdate.

    1. eLife assessment

      This study presents an important conceptual advance of how vitamin A and its derivatives contribute to atherosclerosis. There is solid evidence for the contributions of specialized populations of T cells in atherosclerosis resolution, including use of multiple in vivo models to validate the functional effects. A limitation is the insufficient analysis of lesions, but the manuscript has been improved from the original preprint version and the overarching conclusions have been refined.

    2. Author Response

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study presents a valuable conceptual advance of how Vitamin A and its derivatives contribute to atherosclerosis. There is solid evidence invoking the contributions of specialized populations of T cells in atherosclerosis resolution, including use of multiple in vivo models to validate the functional effect. The significance of the study would be strengthened with more detailed interrogation of lesion composition and consolidation with previous work on the topic from human studies.

      Answer: We thank the reviewers and editorial office for their comments and constructive criticism. Below we provide point by point responses to the comments and concerns, which include the issues of lesion composition and consolidation with human studies. We also proofread the manuscript and included information about the immunostaining procedures that were previously missing (Lines 199 – 206).

      Public Reviews

      REVIEWER #1:

      This is an interesting study by Pinos and colleagues that examines the effect of beta carotene on atherosclerosis regression. The authors have previously shown that beta carotene reduces atherosclerosis progression and hepatic lipid metabolism, and now they seek to extend these findings by feeding mice a diet with excess beta carotene in a model of atherosclerosis regression (LDLR antisense oligo plus Western diet followed by LDLR sense oligo and chow diet). They show some metrics of lesion regression are increased upon beta carotene feeding (collagen content) while others remain equal to normal chow diet (macrophage content and lesion size). These effects are lost when beta carotene oxidase (BCO) is deleted. The study adds to the existing literature that beta carotene protects from atherosclerosis in general, and adds new information regarding regulatory T-cells. However, the study does not present significant evidence about how beta-carotene is affecting T-cells in atherosclerosis. For the most part, the conclusions are supported by the data presented, and the work is completed in multiple models, supporting its robustness. However, there are a few areas that require additional information or evidence to support their conclusions and/or to align with the previously published work.

      Specific additional areas of focus for the authors:

      1. The premise of the story is that b-carotene is converted into retinoic acid, which acts as a ligand of the RAR transcription factor in T-regs. The authors measure hepatic markers of retinoic acid signaling (retinyl esters, Cyp26a1 expression) but none of these are measured in the lesion, which calls into question the conclusion that Tregs in the lesion are responsible for the regression observed with b-carotene supplementation.

      Answer: We agree with the Reviewer’s comment, which prompted us to quantify the expression of the retinoic acid-sensitive marker Cyp26b1 in the atherosclerotic lesions. Cyp26b1, together with Cyp26a1 and c1, contains retinoic acid response elements (RAREs) in its promoter and is therefore highly sensitive to retinoic acid. Indeed, the mRNA/protein expression of Cyp26s is widely considered a surrogate marker for retinoic acid levels in cells or tissues.

      We typically use Cyp26a1 as a surrogate marker for retinoic acid signaling in the adipose tissue and the liver, as we did in this study. However, our RNA seq data in murine bone-marrow derived macrophages (mBMDMs) exposed to retinoic acid revealed that Cyp26b1 is the only Cyp26 family member responsive to retinoic acid (PMID: 36754230). Actually, Cyp26a1 or c1 were not expressed in our mBMDMs (data not shown). Unlike the M2 marker arginase 1, Cyp26b1 did not respond to IL-4 (Figure iA). Hence, Cyp26b1 is an adequate marker to evaluate retinoic acid signaling in the lesion of mice, rich in macrophages.

      Before staining the lesions, we validated the Cyp26b1 antibody by staining mBMDMs exposed to retinoic acid (Figure iB).

      Author response image 1.

      (A) mBMDMs were divided into M0 or M2 (exposed to IL-4 for 24 h), and then treated with either DMSO or retinoic acid for 6 h before harvesting for RNA seq analysis. Exploring the RNA seq dataset, we identified Cyp26b1 as a RA-sensitive gene in mBMDMs (PMID: 36754230). (B) Validation of Cyp26b1 antibody in mBMDMs exposed to retinoic acid confirms the suitability of this antibody for measuring retinoic acid signaling in our experimental settings.

      In the current version of the manuscript, we include the results of Cyp26b1 quantifications (Figure 5H, I), (Lines: 362 - 366). To put these findings in perspective to human studies, we discuss these results in relation to the role human CYP26B1 plays in the atherosclerotic lesion (Lines: 450 - 464).

      2. There does not appear to be a strong effect of Tregs on the b-carotene induced pro-regression phenotype presented in Figure 5. The only major CD25+ cell dependent b-carotene effect is on collagen content, which matches with the findings in Figures 1 and 2. This mechanistically might be very interesting and novel, yet the authors do not investigate this further or add any additional detail regarding this observation. This would greatly strengthen the study and the novelty of the findings overall as it relates to b-carotene and atherosclerosis.

      Answer: As the Reviewer points out, the effects of β-carotene on collagen content are more pronounced than those on CD68 content in the lesion. Indeed, we have observed this in the majority of the experiments in this manuscript.

      Collagen accumulation in the lesion is a complex process, where smooth muscle cells secrete collagen and plaque macrophages (typically) degrade it. Matrix metalloproteases produced by macrophages contribute to the degradation of collagen, and studies show that retinoic acid regulates the expression of metalloproteinases in various cell types (PMID: 2324527, 24008270). We explored the expression of metalloproteases in macrophages exposed to retinoic acid in our mBMDM RNA seq, but we did not observe any significant result (data not shown).

      Interestingly, M2 macrophages can secrete collagen by upregulating arginase 1 expression. In the current version of the manuscript, we acknowledge this in the results (Lines: 358-359) and in the discussion section (Lines: 443-449).

      3. The title indicates that beta-carotene induces Treg 'expansion' in the lesion, but this is not measured in the study.

      Answer: Following the suggestion by the Reviewer, we have re-worded the title to “β-carotene accelerates the resolution of atherosclerosis in mice”

      REVIEWER #2:

      Pinos et al present five atherosclerosis studies in mice to investigate the impact of dietary supplementation with b-carotene on plaque remodeling during resolution. The authors use either LDLR-ko mice or WT mice injected with ASO-LDLR to establish diet-induced hyperlipidemia and promote atherogenesis during 16 weeks, and then they promote resolution by switching the mice for 3 weeks to a regular chow, either deficient or supplemented with b-carotene. Supplementation was successful, as measured by hepatic accumulation of retinyl esters. As expected, chow diet led to reduced hyperlipidemia, and plaque remodeling (both reduced CD68+ macs and increased collagen contents) without actual changes in plaque size. But, b-carotene supplementation resulted in further increased collagen contents and, importantly, a large increase in plaque regulatory T-cells (TREG). This accumulation of TREG is specific to the plaque, as it was not observed in blood or spleen. The authors propose that the anti-inflammatory properties of these TREG explain the atheroprotective effect of b-carotene, and found that treatment with anti-CD25 antibodies (to induce systemic depletion of TREG) prevents b-carotene-stimulated increase in plaque collagen and TREG.

      1. An obvious strength is the use of two different mouse models of atherogenesis, as well as genetic and interventional approaches. The analyses of aortic root plaque size and contents are rigorous and included both male and female mice (although the data was not segregated by sex). Unfortunately, the authors did not provide data on lesions in en face preparations of the whole aorta.

      Answer: We appreciate the positive comments on rigor. We considered displaying our data segregated by sex, although for some experiments, we did not have matching numbers of male and female mice, which could be distracting for the reader. The goal of our study was to analyze changes in plaque composition. Therefore, our experimental approach was designed to study atherosclerosis resolution (plaque composition changes, but not plaque size) instead of atherosclerosis regression (both plaque composition and size change). As expected, we did not observe differences in plaque size at the level of the atherosclerotic root for any of our experiments, which deterred us from quantifying plaque content by en-face in the aorta.

      2. Overall, the conclusion that dietary supplementation with b-carotene may be atheroprotective via induction of TREG is reasonably supported by the evidence presented. Other conclusions put forth by the authors (e.g., that vitamin A production favors TREG production or that BCO1 deficiency reduces plasma cholesterol), however, will need further experimental evidence to be substantiated.

      Answer: We apologize for the lack of clarity in the presentation of our results and overstating our conclusions. We have rephrased some of these conclusions in the results and discussion sections.

      3. The authors claim that b-carotene reduces blood cholesterol, but data shown herein show no differences in plasma lipids between mice fed b-carotene-deficient and -supplemented diets (Figs. 1B, 2A, and S3A).

      Answer: As Reviewer 2 points out, we did not observe changes in plasma cholesterol between mice undergoing Resolution in response to β-carotene. For clarity, we rephrased our plasma lipids results for each of our experimental designs (Lines: 230 – 236, 270 – 272, and 288-290). We also include a clarification in the discussion section about the differential effects of β-carotene on plasma lipids when mice undergo atherosclerosis progression and resolution. (Lines: 419 - 430).

      4. Also, the authors present no experimental data to support the idea that BCO1 activity favors plaque TREG expansion (e.g., no TREG data in Fig 3 using Bco1-ko mice).

      Answer: We appreciate the suggestion by Reviewer 2. In the current version of the manuscript, we stained the aortic roots from Bco1-/- mice for FoxP3. We did not observe differences between Control and β-carotene resolution groups, in agreement with the results in plaque composition (CD68 and collagen contents). These new data strengthen our manuscript, and we have now included these results as Supplementary Figure 3D, E. (Lines: 465 - 471).

      5. As the authors show, the treatment with anti-CD25 resulted in only partial suppression of TREG levels. Because CD25 is also expressed in some subpopulations of effector T-cells, this could potentially cloud the interpretation of the results. Data in Fig 4H showing loss of b-carotene-stimulated increase in numbers of FoxP3+GFP+ cells in the plaque should be taken cautiously, as they come from a small number of mice. Perhaps an orthogonal approach using FoxP3-DTR mice could have produced a more robust loss of TREG and further confirmation that the loss of plaque remodeling is indeed due to loss of TREG.

      Answer: We agree with the reviewer, and we rephrased the results and discussion to avoid overstating our findings. We now acknowledge that a second experimental approach would help us confirm our findings employing a blocking antibody targeting CD25. We favored the use of anti-CD25 infusions over other depletion methods based on the experimental protocol carried out by our collaborators, in which they examined the effect of Tregs on atherosclerosis regression (PMID: 32336197). The utilization of FoxP3-DTR mice would nicely complement our findings. In the current version of the manuscript, we discuss this alternative approach (Lines: 491 - 501).

      Recommendations for the Authors

      All reviewers agreed that despite the claims of the title, there is no direct interrogation of Tregs or vitamin A signaling in lesions.

      The work does not consolidate well with the role of B-carotene in human heart disease. Additional discussion and synthesis are required to elaborate on the significance of the findings. For example, the idea of beta carotene supplementation for cardiovascular prevention has attracted attention for years, but a recent meta-analysis showed no benefit and, if anything, an increase in cardiovascular events. The U.S. Preventive Services Task Force (USPSTF) went as far as to recommend AGAINST the use of beta-carotene for the prevention of cardiovascular disease.

      In light of the above point and eLife editorial policies, please revise the title to include species.

      Answer: Thanks for your feedback. Carotenoid metabolism in mammals is complex, and establishing direct parallelisms between humans and rodents must be done with caution. For example, β-carotene supplementation in humans inevitably results in the accumulation of this compound in plasma, while in rodents, β-carotene is quickly metabolized to vitamin A. Our findings over the years reveal that the effects of β-carotene in mice derive exclusively from its role as vitamin A precursor.

      In the current study, we confirm our previous work utilizing Bco1-/- mice, which are unable to produce vitamin A when fed β-carotene. Then, we observe that vitamin A promotes atherosclerosis resolution in mice independently of alterations in plasma cholesterol in two independent mouse models. Lastly, we utilized anti-CD25 blocking antibodies to deplete Tregs to establish a direct connection between dietary β-carotene/vitamin A and Tregs in the lesion. While this experimental approach failed to completely deplete Tregs, our morphometric assays indicate that these infusions were sufficient to partially mitigate the effect of β-carotene on atherosclerosis resolution.

      Regardless, in the discussion section of our manuscript, we attempt to consolidate our preclinical studies with clinical data (Lines: 374 – 376, and 461 – 464).

      We have also revised the title, as suggested by Reviewer 1. We also included “mice” in the title to align with the editorial policies of eLife.

      Reviewer #1:

      1.1. The authors need to measure retinoic acid signaling directly in the lesion and in Tregs to be able to draw the conclusion that b-carotene is directly activating Tregs to promote regression.

      Answer: Please see comments above.

      1.2. The authors should investigate the role of beta carotene on collagen production by T-regs.

      Answer: Please see comments above.

      Reviewer #2 (Recommendations For The Authors):

      Major:

      2.1. If the authors still have frozen sections of the aortas from their Bco1-ko experiment, it should be trivial to look at plaque TREG contents to confirm that vitamin A production is indeed needed for the effect of b-carotene on plaque remodeling.

      Answer: Please see comments above.

      Minor:

      2.2. This reviewer wonders if the axis for lesion size in all figures is off by an order of magnitude. Most studies show aortic root lesions in the 10^5 um2 range, not in the 10^6 um2.

      Answer: We apologize for this error. We have corrected the units in all our quantifications.

      2.3. FPLC lipoprotein profiles would enhance the manuscript.

      Answer: We have run FPLCs for the plasmas and included them in the results (Lines: 233 – 236). Data are presented in Figure 1C, D.

      2.4. This reviewer could not cope with the thought that mice that are fed 16+ weeks a diet that is vitamin A-deficient did not become vit A-deficient (e.g., Fig. 1E). Perhaps the authors could elaborate a little on this in their discussion.

      Answer: Mice are extremely resistant to vitamin A deficiency. A common protocol to achieve deficiency in mice requires feeding a vitamin A deficient diet to dams during their pregnancy and lactation to deplete new-born pups of vitamin A stores. Even in that situation, pups display enough vitamin A stores to sustain circulating vitamin A levels comparable to those observed in wild-type mice. In the current version of the manuscript, we have included a paragraph in the discussion to cover this “interesting” aspect (Lines: 476 – 483).

    1. eLife assessment

      This important study combines state-of-the-art proteomics and genetic manipulation of Chlamydia trachomatis to study the function of a chlamydial effector, Cdu1, with deubiquitination and acetylation activities. Solid evidence is provided to show that Cdu1 is able to protect itself and three other chlamydial effectors, which are involved in the control of chlamydial egress from host cells, from ubiquitin-mediated degradation, and that this depends on the acetylation activity of Cdu1, but not on its deubiquitination activity. This work will be of interest to microbiologists and cell biologists studying host cell-pathogen interactions.

    1. Author Response

      Reviewer #1 (Public Review):

      Summary:

      The evolution of transporter specificity is currently unclear. Did solute carrier systems evolve independently in response to a cellular need to transport a specific metabolite in combination with a specific ion or counter metabolite, or did they evolve specificity from an ancestral protein that could transport and counter-transport most metabolites? The present study addresses this question by applying selective pressure to Saccharomyces cerevisiae and studying the mutational landscape of two well-characterised amino acid transporters. The data suggest that AA transporters likely evolved from an ancestral transporter and then specific sub-families evolved specificity depending on specific evolutionary pressure.

      Strengths:

      The work is based on sound logic and the experimental methodology is well thought through. The data appear accurate, and where ambiguity is observed (as in the case of citrulline uptake by AGP1), in vitro transport assays are carried out to verify transport function.

      Weaknesses:

      Although the data and findings are well described, the study lacked additional contextual information that would support a clear take-home message.

      We appreciate the reviewer’s positive assessment of the work, and the helpful comment to summarize the findings into a short take-home message. We chose not to discuss protein evolution theories in detail to keep the text as concise as possible. However, we do acknowledge the fact that the reader might want to see our results embedded in more context. In a revised version, we will integrate our findings more with the pertinent literature, which will show how our results align with theoretical models for protein evolution towards novel functions. We will also discuss in more detail how our laboratory results could be translated into a “natural” setting of evolution.

      Reviewer #2 (Public Review):

      Summary:

      This paper describes evolution experiments performed on yeast amino acid transporters aiming at the enlargement of the substrate range of these proteins. Yeast cells lacking 10 endogenous amino acid transporters and thus being strongly impaired to feed on amino acids were again complemented with amino acid transporters from yeast and grown on media with amino acids as the sole nitrogen source.

      In the first set of experiments, complementation was done with seven different yeast amino acid transporters, followed by measuring growth rates. Although most of them have been described before in other experimental contexts, the authors could show that many of them have a broader substrate range than initially thought.

      Moving to the evolution experiments, the authors used the OrthoRep system to perform random mutagenesis of the transporter gene while it is actively expressed in yeast. The evolution experiments were conducted such that the medium would allow for poor/slow growth of cells expressing the wt transporters, but much better/faster growth if the amino acid transporter would mutate to efficiently take up a poorly transported (as in the case of citrulline and AGP1) or non-transported (as in the case of Asp/Glu and PUT4) amino acid.

      In this way, and using Sanger sequencing of plasmids isolated from faster-growing clones, the authors identified a number of mutations that were repeatedly present in biological replicates. When these mutations were re-introduced into the transporter using site-directed mutagenesis, faster growth on the said amino acids was confirmed. The authors attempted to confirm the growth phenotype data by uptake experiments using radioactive amino acids; however, the radioactive uptake data and growth-dependent analyses do not fully match, hinting that parameters other than amino acid uptake alone affect the growth rates.

      When mapped to Alphafold prediction models on the transporters, the mutations mapped to the substrate permeation site, which suggests that the changes allow for more favourable molecular interactions with the newly transported amino acids.

      Finally, the authors compared the growth rates of the evolved transporter variants with those of the wt transporter and found that some variants exhibit a somewhat diminished capacity to transport their original range of amino acids, while other variants were as fit as the wt transporter in terms of uptake of its original range of amino acids.

      Based on these findings, the authors conclude that transporters can evolve novel substrates through generalist intermediates, either by increasing a weak activity or by establishing a new one.

      Strengths:

      The study provides evidence in favour of an evolutionary model, wherein a transporter can "learn" to translocate novel substrates without "forgetting" what it used to transport before. This evolutionary concept has been proposed for enzymes before, and this study shows that it also can be applied to transporters. The concept behind the study is easy to understand, i.e. improving growth by uptake of more amino acids as nitrogen source. In addition, the study contains a large and extensive characterization of the transporter variants, including growth assays and radioactive uptake measurements.

      Weaknesses:

      The authors took a genetic gain-of-function approach based on random mutagenesis of the transporter. While this has worked out for two transporters/substrate combinations, I wonder how comprehensive and general the insights are. In such approaches, it is difficult to know which mutation space is finally covered/tested. And information that can be gained from loss-of-function analyses is missed. The entire conclusions are grounded on a handful of variants analyzed. Accordingly, the outcome is somewhat anecdotal; in some cases, the fitness of the variants was changed and in others not. Highlighting the amino acid changes in the context of the structural models is interesting, but does not fully explain why the variants exhibit changed substrate ranges. Two important technical elements have not been studied in detail by the authors, but may well play a certain role in the interpretation of the results. Firstly, the authors did not quantify the amount of transporter being present on the cell surface; altered surface expression can impact uptake rates and thus growth rates. Secondly, the authors have not assessed whether overexpressing wt versus variant transporters has an impact on the growth rate per se. Overexpressing transporters from plasmids is quite a burden for the cells and often impacts growth rates. Variants may be more or less of a burden, an effect that may (or may also not) go hand in hand with increased/decreased surface production levels.

      And finally, I was somewhat missing an evolutionary analysis of these transporters to gain insights into whether the identified substitutions also occurred during natural evolution under real-life conditions.

      First of all, we thank the reviewer for the attention to detail with which they have read the manuscript, and the very helpful comments on how to improve it. We will indeed take on some of the suggestions in a revised version of the text:

      Regarding the match of growth rate and uptake rate measurements, we plan to plot their correlation in a graph.
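
      A growth-rate versus uptake-rate comparison of this kind is usually summarized with a correlation coefficient. The following is a minimal sketch of a Pearson correlation, not the authors' analysis: the function name `pearson_r` and the numbers in the usage lines are hypothetical placeholders rather than measurements from the study.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Placeholder values only (arbitrary units), one entry per transporter variant.
growth_rates = [0.10, 0.15, 0.22, 0.30]
uptake_rates = [1.0, 1.4, 2.5, 2.9]
print(round(pearson_r(growth_rates, uptake_rates), 3))
```

      A coefficient close to 1 would indicate that uptake explains most of the variation in growth rate; a weaker value would be in line with the reviewer's suggestion that other parameters contribute.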

      Regarding the amount of transporter on the plasma membrane, we acknowledge that the visual representation of the fluorescence micrographs already in the text might not be enough. We therefore will quantify expression levels from said micrographs and include the information in the manuscript.

      On a similar note, we had already measured the growth rates of all transporter variant cultures in the absence of selection for amino acid uptake (i.e., in medium with ammonium as the nitrogen source; Figure 4 - Supplement figure 1). We will include the measured growth rates in the text to give an indication of what the impact of transporter overexpression is on the growth rate per se.

      Regarding the proposed analysis of natural transporter sequences, we do see the possible value in such an analysis. However, it is currently out of scope for the present study. The reasons are 1) that preliminary analyses show that the sequence similarity of functionally verified/annotated transporters is too low to reliably pinpoint a phenotype to a single residue, and 2) that we do not envision that the variants that we discovered are necessarily beneficial in a natural setting, where fine-grained regulation of amino acid transport may be more important than a broad substrate range. Regarding the generality of the insights, we do agree with the reviewer’s comment that we “only” analyzed a relatively small number of variants. However, the target of the study was not to generate high-throughput data on a large set of variants (e.g., by NGS of the whole culture) but to provide in-depth data for characterized and verified variants in a clean genetic background (i.e., verified phenotype and fitness measurements on all native and novel substrates).

      As to the mutation space, we will include an estimate in a revised version of the text. We estimate that a majority of all possible single mutants is covered in the first and second passages of the selection experiment, which is corroborated by the fact that we repeatedly find the same mutants in biological replicates.

      Regarding the mentioned loss-of-function analyses, we are unsure about what the reviewer intends with this statement at this point. To briefly summarize, we feel that our results are a good indication that transporters can evolve new functions analogously to enzymes. We explicitly do not imply that this is the only way to evolve novelty.

      Reviewer #3 (Public Review):

      The goal of the current manuscript is to investigate how changes in transporter substrate specificity emerge through experimental evolution. The authors investigate the APC family of amino acid transporters, a large family with many related transporters that together cover the spectrum of amino acid uptake in yeast.

      The authors use a clever approach for their experimental evolutions. By deleting 10 amino acid uptake transporters in yeast, they develop a strain that relies on amino acid import by introducing APC transporters under nitrogen-limiting conditions. They can thus evolve transporters towards the transport of new substrates if no other nitrogen source is available. The main takeaway from the paper is that it is relatively easy for the spectrum of substrates in a particular transporter of this family to shift, as a number of single mutants are identified that modulate substrate specificity. In general, transporters evolved towards gain-of-function mutations (better or new activities) and also confer transport promiscuity, expanding the range of amino acids transported.

      The data in the paper support the conclusions, in general, and the outcomes (evolution towards promiscuity) agree with the literature available for soluble enzymes. However, it is also a possibility that the design of these experiments selects for promiscuity among amino acids. The selections were designed such that yeast had access to amino acids that were already transported, with a greater abundance of the amino acid that was the target of selection. Under these conditions, it seems probable that the fittest variants will provide the yeast access to all amino acid substrates in the media, and unlikely that a specificity swap would occur, limiting the yeast to only the new amino acid.

      The authors also examine the fitness costs of mutants, but only in the narrow context of growth on a single (original) amino acid under conditions of nitrogen limitation. Amino acid uptake is typically tightly controlled because some amino acids (or their carbon degradation products) are toxic in excess. This paper does not address or discuss whether there might be a fitness cost to promiscuous mutants in conditions where nitrogen is not limiting.

      We are grateful for the reviewer’s insightful comments on the paper.

      Regarding the design of our experiments, we followed the concept of directed evolution as described by pioneers of the field, in which the starting point for evolving a protein is to have a basic level of that activity. In the case of AGP1, the promiscuous activity is Cit uptake. We recognize that elimination of all the already transported amino acids from the evolution media could also yield very insightful results. However, we aimed to simulate the effect of the evolutionary pressure acting in a “natural” environment, where the uptake of the specific amino acid is not initially crucial for its survival. In the case of PUT4, the experimental design was chosen to ensure the initial survival of the culture (since neither Glu nor Asp support the growth of the strain) by providing a low level of already transported amino acids. In the revised manuscript, we will state this more clearly.

      Regarding the second point, we agree that a short discussion about the potentially detrimental effects of promiscuous transporters would be beneficial for the reader. We will touch on this aspect in the revised version of the text. Indeed, our system is intentionally simplified, as we try to take regulation of transport out of the equation (e.g., by using the constitutive ADH1 promoter as opposed to a nitrogen-regulated one). In a natural setting, microorganisms encounter fluctuations of nutrient availability, necessitating tight control of nutrient transport. This is probably a major reason why microorganisms typically encode transporters with redundant specificities (i.e., promiscuous and specific ones). Otherwise, one very broad-range nutrient transporter would suffice. In our system, we artificially select for broad-range transport, which is reflected in the observed phenotypes of the evolved transporters. We expect that in a natural setting, a broad-range transporter would be a stepping stone to evolve a narrow-range transporter with a new specificity (which is actually what we see in the double-mutant AGP1-NV, with lowered fitness in original substrates and increased fitness in Cit).

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study advances our understanding of the ways in which different types of communication signals differentially affect mouse behaviors and amygdala cholinergic/dopaminergic neuromodulation. Researchers interested in the complex interaction between prior experience, sex, behavior, hormonal status, and neuromodulation should benefit from this study. Nevertheless, the data analysis is incomplete at this stage, requiring additional analysis and description, justification, and - potentially - power to support the conclusions fully. With the analytical part strengthened, this paper will be of interest to neuroscientists and ethologists.

      GENERAL COMMENTS ON REVIEWS AND REVISIONS

      Experimental design

      Here we address questions from several reviewers regarding our periods of neuromodulator and behavioral analysis. First, we recognize that the text would benefit from an overview of the experimental structure different from the narrative we provide in the first paragraphs of the Results. We now include this near the beginning of the Materials and Methods (page 17). We further articulate that the 10-minute time periods were dictated by the sampling duration required to perform accurate neurochemical analyses (and to reserve half of the sample in the event of a catastrophic failure of batch-processing samples). Since neurochemical release may display multiple temporal components (e.g., ACh: Aitta-aho et al., 2018) during playback stimulation, and since these could differ across neurochemicals of interest, we decided to collect, analyze, and report in two stimulus periods as well as one Pre-Stim control. We now clarify this in additional text in the Materials and Methods (p. 24, lines 20-22; p. 26, lines 17-19). We decided not to include analyses of the post-stimulus period because this is subject to wider individual and neuromodulator-specific effects and because it weakens statistical power in addressing the core question: the change in neuromodulator release DURING vocal playback.

      We also sought to clarify the meaning of the periods “Stim 1” and “Stim 2”; they are two data collection periods, using the same exemplar sequences in the same order. We have added statements in the Materials and Methods (p. 18, lines 4-7; Fig. caption, p. 39, lines 11-13) to clarify these periods.

      For behavioral analyses, observation periods were much shorter than 10 mins, but the main purpose of behavioral analyses in this report is to relate to the neurochemical data. As a result, we matched the temporal features of the behavioral and neurochemical analyses (p. 22, lines 17-22). We plan a separate report, focused exclusively on a broader set of behavioral responses to playback, that may examine behaviors at a more granular level.

      Data and statistical analyses

      Reviewers 1 and 3 expressed concerns about our normalization of neurochemical data, suggesting that it diminishes statistical power or is not transparent. We note that normalization is a very common form of data transformation that does not diminish statistical power. It is particularly useful for data forms in which the absolute value of the measurement across experiments may be uninformative. Normalization is routine in microdialysis studies, because data can be affected by probe placement and factors affecting neurochemical recovery and processing. Recent examples include:

      Li, Chaoqun, Tianping Sun, Yimu Zhang, Yan Gao, Zhou Sun, Wei Li, Heping Cheng, Yu Gu, and Nashat Abumaria. "A neural circuit for regulating a behavioral switch in response to prolonged uncontrollability in mice." Neuron (2023).

      Gálvez-Márquez, Donovan K., Mildred Salgado-Ménez, Perla Moreno-Castilla, Luis Rodríguez-Durán, Martha L. Escobar, Fatuel Tecuapetla, and Federico Bermudez-Rattoni. "Spatial contextual recognition memory updating is modulated by dopamine release in the dorsal hippocampus from the locus coeruleus." Proceedings of the National Academy of Sciences 119, no. 49 (2022): e2208254119.

      Holly, Elizabeth N., Christopher O. Boyson, Sandra Montagud-Romero, Dirson J. Stein, Kyle L. Gobrogge, Joseph F. DeBold, and Klaus A. Miczek. "Episodic social stress-escalated cocaine self-administration: role of phasic and tonic corticotropin releasing factor in the anterior and posterior ventral tegmental area." Journal of Neuroscience 36, no. 14 (2016): 4093-4105.

      Bagley, Elena E., Jennifer Hacker, Vladimir I. Chefer, Christophe Mallet, Gavan P. McNally, Billy CH Chieng, Julie Perroud, Toni S. Shippenberg, and MacDonald J. Christie. "Drug-induced GABA transporter currents enhance GABA release to induce opioid withdrawal behaviors." Nature neuroscience 14, no. 12 (2011): 1548-1554.

      However, since all reviewers requested raw values of neurochemicals, we provide these in supplementary tables 1-3. The manuscript references these tables early in the Results (p. 6, lines 18-19) and in the Materials and Methods (p. 27, lines 3-4).
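
      To make the relationship between raw concentrations and normalized values concrete, the sketch below expresses each sample as a percentage change from the Pre-Stim baseline. It is an illustration under an assumption: the response does not state the exact formula, so the conventional form 100 × (sample − baseline) / baseline is used here. The function name and the concentrations are hypothetical and are not data from the study.

```python
from typing import List, Sequence


def percent_change_from_baseline(baseline: float,
                                 samples: Sequence[float]) -> List[float]:
    """Express each sample as a percentage change relative to the baseline.

    Assumed formula (not confirmed by the manuscript):
        100 * (sample - baseline) / baseline
    """
    if baseline <= 0:
        raise ValueError("baseline concentration must be positive")
    return [100.0 * (s - baseline) / baseline for s in samples]


# Hypothetical concentrations in arbitrary units for one animal.
pre_stim = 2.0
stim_1, stim_2 = 2.5, 1.5

normalized = percent_change_from_baseline(pre_stim, [pre_stim, stim_1, stim_2])
print(normalized)  # [0.0, 25.0, -25.0]
```

      Under this formula the baseline period is 0 for every animal by construction, which is the property Reviewer 1 refers to when noting that the baseline shows no variation.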

      All reviewers commented on correlation analyses that we presented, with different perspectives. Reviewer 2 questioned the validity of such analyses, performed across experimental groups, while Reviewer 1 pointed out that the analyses were redundant with the GLM. We agree with these criticisms, and note the challenges associated with correlations involving behaviors for which there is a “floor” in the number of observations. As a result, we have removed most correlation analyses from the manuscript. The text and figures have been modified accordingly. Due to these changes, we have to decline requests of Reviewer 3 to include many more such analyses. While correlation analyses could still be performed between neurochemicals and behaviors for each group, the statistical power would be very low given the relatively small size of each experimental group, the large number of groups, and the even larger number of pairings between neurochemicals and behaviors. The only correlations we utilize in the manuscript concern the interpretation of our increased acetylcholine levels.

      As part of this revision, we re-ran our statistical analyses on neuromodulators because of a calculation error in 3 animals (regarding baseline values). In a few instances, a significance level changed, but none of these changed a conclusion regarding neuromodulator changes under our experimental conditions.

      Other revisions

      INTRODUCTION: We modified the Introduction to provide both a more general framework and specific gaps in our understanding relating neuromodulators with vocal communication.

      DISCUSSION: We have added material in the first two pages of the Discussion to provide more framework to our conclusions, to address the issues of the temporal aspects of neurochemical release and behavioral observations, and to identify limitations that should be addressed in future studies.

      FIGURES: All figures are now in the main part of the manuscript. We modified most figures in response to reviewer comments. We removed neuromodulator – behavior correlations from several figures. We modified all box plots to ensure that all data points are visible. The visible data points match the numbers reported in figure captions. We brought 5-HIAA data into the main figures reporting on neuromodulator results.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript addresses a fundamental question about how different types of communication signals differentially affect brain states and neurochemistry. In addition, the manuscript highlights the various processes that modulate brain responses to communication signals, including prior experience, sex, and hormonal status. Overall, the manuscript is well-written and the research is appropriately contextualized. The authors are thoughtful about their quantitative approaches and interpretations of the data.

      That being said, the authors need to work on justifying some of their analytical approaches (e.g., normalization of neurochemical data, dividing the experimental period into two periods (as opposed to just analyzing the entire experimental period as a whole)) and should provide a greater discussion of how their data also demonstrate dissociations between neurochemical release in the basolateral amygdala and behavior (e.g., neurochemical differences during both of the experimental periods but behavioral differences only during the first half of the experimental period). The normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis and could be problematic; by normalizing all data to the baseline data (p. 24), one artificially creates a baseline period with minimal variation (all are "0"; Figures 2, 3 & 5) that could inflate statistical power.

      Please see our general responses to structure of observation periods and normalization of neuromodulator data. Normalization is a common and appropriate procedure in microdialysis studies that does not alter statistical power.

      We have included a section in the Discussion concerning the temporal relationship between behavioral responses and neurochemical changes in response to vocal playback (p. 12, lines 3-17). We note where the linkage is particularly strong (e.g., ACh release and flinching). This points to a need to examine these phenomena with finer temporal resolution, but also with the recognition that the brain circuits driving a behavioral response may extend beyond the BLA.

      The Introduction could benefit from a priori predictions about the differential release of specific neuromodulators based on previous literature.

      We added some material to the Introduction to provide additional rationale for the study. However, we did not attempt to develop predictions for the range of neuromodulators that we sought to test. The literature can lead to opposite predictions for a given neuromodulator. For example, acetylcholine could be associated with both positive and negative valence. Instead, we note in the Introduction the association of both DA and ACh with vocalizations.

      The manuscript would also benefit from a description of space use and locomotion in response to different valence vocalizations.

      We have provided additional descriptions of space use and video tracking data in Materials and Methods (p. 23, lines 1-6). We now report a few correlations based on these data in the Results to demonstrate that increased ACh in Restraint males and Mating estrus females was not related to the amount of locomotion (p. 9, lines 8-14).

      Nevertheless, the current manuscript seems to provide some compelling support for how positive and negative valence vocalizations differentially affect behavior and the release of acetylcholine and dopamine in the basolateral amygdala. The research is relevant to broad fields of neuroscience and has implications for the neural circuits underlying social behavior.

      Reviewer #2 (Public Review):

      Ghasemahmad et al. report findings on the influence of salient vocalization playback, sex, and previous experience, on mice behaviors, and on cholinergic and dopaminergic neuromodulation within the basolateral amygdala (BLA). Specifically, the authors played back mice vocalizations recorded during two behaviors of opposite valence (mating and restraint) and measured the behaviors and release of acetylcholine (ACh), dopamine (DA), and serotonin in the BLA triggered in response to those sounds.

      Strength: The authors identified that mating and restraint sounds have a differential impact on cholinergic and dopaminergic release. In male mice, these two distinct vocalizations exert an opposite effect on the release of ACh and DA. Mating sounds elicited a decrease of ACh release and an increase of DA release. Conversely, restraint sounds induced an increase in ACh release and a trend to decrease in DA. These neurotransmission changes were different in estrus females for whom the mating vocalization resulted in an increase of both DA and ACh release.

      Weaknesses: The behavioral analysis and results remain elusive, and although addressing interesting questions, the study contains major flaws, and the interpretations are overstating the findings.

      Although Reviewer 2 raises several valid issues that we have addressed in our response and revision, we believe that none represent “major flaws” in the study that challenge the validity of our central conclusions. In brief, we will:

      --provide enhanced description of behaviors (pp. 22-23 and Table 1)

      --clarify / modify box-plot representations of data (p. 28, lines 3-9)

      --point to our methods that describe corrections for multiple comparisons (p. 27, lines 15-16)

      --revise figures to clarify sample size (Figs. 3-6)

      Reviewer #3 (Public Review):

      Ghasemahmad et al. examined behavioral and neurochemical responses of male and female mice to vocalizations associated with mating and restraint. The authors made two significant and exciting discoveries. They revealed that the affective content of vocalizations modulated both behavioral responses and the release of acetylcholine (ACh) and dopamine (DA) but not serotonin (5-HIAA) in the basolateral amygdala (BLA) of male and female mice. Moreover, the results show sex-based differences in behavioral responses to vocalizations associated with mating. The authors conclude that behavior and neurochemical responses in male and female mice are experience-dependent and are altered by vocalizations associated with restraint and mating. The findings suggest that ACh and DA release may shape behavioral responses to context-dependent vocalizations. The study has the potential to significantly advance our understanding of how neuromodulators provide internal-state signals to the BLA while an animal listens to social vocalizations; however, multiple concerns must be addressed to substantiate their conclusions.

      Major concerns:

      1) The authors normalized all neurochemical data to the background level obtained from a single pre-stimulus sample immediately preceding playback. The percentage change from the background level was calculated based on a formula, and the underlying concentrations were not reported. The authors should report the sample and background concentrations to make the results and analyses more transparent. The authors stated that NE and 5-HT had low recovery from the mouse brain and hence could not be tracked in the experiment. The authors could be more specific here by relating the concentrations to ACh, DA, and 5-HIAA included in the analyses.

      Please see our general statement regarding normalization of neurochemical data. We have added supplemental tables that show concentrations of dopamine, acetylcholine, and 5-HIAA. We do not report serotonin or noradrenaline since these were below the detection threshold.

      2) For the EXP group, the authors stated that each animal underwent 90-min sessions on two consecutive days that provided mating and restraint experiences. Did the authors record mating or copulation during these experiments? If yes, what was the frequency of copulation? What other behaviors were recorded during these experiences? Did the experiment encompass other courtship behaviors along with mating experiences? Was the female mouse in estrus during the experience sessions?

      In the mating experience, mounting or attempted mounting was required for the animal to be included in subsequent testing. Since the session lasted 90 minutes, more general courtship behavior was likely. However, we did not record detailed behaviors or track estrous stage for the mating experience. See p. 21, lines 20-22.

      3) For the mating playback, the authors stated that the mating stimulus blocks contained five exemplars of vocal sequences emitted during mating interactions. The authors should clarify whether the vocal sequences were emitted while animals were mating/copulating or when the male and female mice were inside the test box. If the latter was the case, it might be better to call the playback "courtship playback" instead of "mating playback".

      We have modified the Results (p. 5, lines 18-20) and Materials and Methods (p. 21, lines 8-15) to clarify our meaning. We continue to use the term “mating” because this refers to a specific set of behaviors associated with mounting and copulation, rather than the more general term “courtship”. We also indicate that we based these behaviors on previous work (e.g., Gaub et al., 2016).

      4) Since most differences that the authors reported in Figure 3 were observed in Stim 1 and not in Stim 2, it might be better to perform a temporal analysis - looking at behaviors and neurochemicals over time instead of dividing them into two 10-minute bins. The temporal analysis will provide a more accurate representation of changes in behavior and neurochemicals over time.

      Please see our general response to the structuring of experimental periods. The 10-min periods are the minimum for the neurochemical analyses, and we adopted the same periods for behavioral analyses to match the two types of observations. Our repeated measures analysis is a form of temporal analysis, since it compares values in three observation periods.

      5) In Figures 2 and 3, the authors show the correlation between Flinching behavior and ACh concentration. The authors should report correlations between concentrations of all neurochemicals (not just ACh) and all behaviors recorded (not just Flinching), even if they are insignificant. The analyses performed for the stim 1 data should also be performed on the stim 2 data. Reporting these findings would benefit the field.

      Please see general comments regarding correlation analyses. We removed almost all such analyses and references to them from the manuscript based on concerns of the other reviewers.

      6) The mice used in the study were between p90 - p180. The mice were old, and the range of ages was considerable. Are the findings correlated with age? The authors should also discuss how age might affect the experiment's results.

      Our p90-p180 mice are not “old”. CBA/CaJ mice display normal hearing for at least 1 year (Ohlemiller, Dahl, and Gagnon, JARO 11: 605-623, 2010) and adult sexual and social behavior throughout our observation period. They are sexually mature adults, appropriate for this study. We decline to perform correlation analyses with age, both because this was not a question for this study and because the very large number of correlations, for each experimental group (as requested by reviewer #2), renders this approach statistically problematic.

      7) The authors reported neurochemical levels estimated as the animals listened to the sounds played back. What about the sustained effects of changes in neurochemicals? Are there any potential long-term effects of social vocalizations on behavior and neurochemical levels? The authors might consider discussing long-term effects.

      We have not included discussion of long term effects of neuromodulatory release, both because our data analysis doesn’t address it (see response to Comment #10) and because we desired to keep the Discussion focused on topics more closely related to the results.

      8) Histology from a single recording was shown in supplementary figure 1. It would benefit the readers if additional histology was shown for all the animals, not just the colored schematics summarizing the recording probe locations. Further explanation of the track location is also needed to help the readers. Make it clear for the readers which dextran-fluorescein labeling image is associated with which track in the schematic.

      Based on the recent publications cited in our overall response to reviewer comments about statistical methods, our reporting of histological location of microdialysis exceeds the standard. We believe that the inclusion of all histology is unnecessary and not particularly helpful. Raw photomicrographs do not always illustrate boundaries, so interpretation is required. However, we added a second photomicrograph example and we identified which tracks correspond to these photomicrographs (see Figure 2; now in main body of manuscript).

      9) The authors did not control for the sounds being played back with a speaker. This control may be necessary since the effects are more pronounced in Stim 1 than in Stim 2. Playing white noise rather than restraint or courtship vocalizations would be an excellent control. However, the authors could perform a permutation analysis and computationally break the relationship between what sound is playing and the neurochemical data. This control would allow the authors to show that the actual neurochemical levels are above or below chance.

      We considered a potential “control” stimulus in our experimental design. We concluded, based on our previous work (e.g., Grimsley et al., 2013; Gadziola et al., 2016), that white noise is not necessarily a neutral stimulus and therefore the results would not clarify the responses to the two vocal stimuli. Instead, we opted to use experience as a type of control. This control shows very clearly that temporal patterns and across-group differences in neurochemical response to playback disappear in the absence of experience with the associated behavior.
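      The permutation analysis suggested in comment 9 can be sketched as a label-shuffling test. This is an illustration only, not an analysis performed in the study: the function name, the choice of test statistic (absolute difference in group means), and the values below are all assumptions.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Shuffling the pooled values breaks the link between stimulus label
    and measurement, giving a chance distribution for the difference.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = abs(mean(group_a) - mean(group_b))
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical percent-change values, not data from the study.
mating = [10, 11, 12, 13, 14, 15, 16]
restraint = [0, 1, 2, 3, 4, 5, 6]
print(permutation_p_value(mating, restraint))
```

      With small groups the number of distinct relabelings is limited, which bounds how small the p value can become.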

      10) The authors indicated that each animal's post-vocalization session was also recorded. No data in the manuscript related to the post-vocalization playback period was included. This omission was a missed opportunity to show that the neurochemical levels returned to baseline, and the results were not dependent on the normalization process described in major concern #1. The data should be included in the manuscript and analyzed. It would add further support for the model described in Figure 6.

      We decided not to include analyses of the post-stimulus period because this period is subject to wider individual and neuromodulator-specific effects and because it weakens statistical power in addressing the core question—the change in neuromodulator release DURING vocal playback. We agree that the general question is of interest to the field, but we don’t think our study is best designed to answer that question.

      11) The authors could use a predictive model, such as a binary classifier trained on the CSF sampling data, to predict the type of vocalizations played back. The predictive model could support the conclusions and provide additional support for the model in Figure 6.

      We recognize that a binary classifier could provide an interesting approach to support conclusions. However, we do not believe that the sample size per group is sufficient to both create and test the classifier.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      • Introduction: It would be useful to set up an experimental framework before delving into the results. What are the predictions about specific neuromodulators based on previous literature?

      Because this narrative is laid out in the first two paragraphs of the Results, which immediately follow the Introduction, we believe that additional text in the Introduction on the experimental framework is redundant. As stated above, detailing predictions for a range of neuromodulators would make for a long and not particularly illuminating Introduction. We instead have related our findings to more general understanding of DA and ACh in the Discussion.

      • There really isn't a major difference in stimuli during the "Stim 1" and "Stim 2" phases, and it's not clear why the authors divided the experimental period into two phases. Therefore, the authors need to justify their experimental approach. For example, the authors could first anecdotally mention that behavioral responses to playbacks seem to be larger in the first half of the playbacks than during the second half, therefore they individually analyzed each half of the experimental period. Or adopt a different approach to justify their design. Overall, the analytical approach is reasonable but it is currently not justified.

      See general comment for analysis periods. As noted, we clarified these issues in several locations within Materials and Methods (p. 24, lines 20-22; p. 26, lines 17-19). We also sought to clarify the meaning of the periods “Stim 1” and “Stim 2”; they are two data collection periods, using the same exemplar sequences in the same order. We have added statements in the Materials and Methods (p. 18, lines 4-7; Fig. caption, p. 39, lines 11-13).

      • The normalization of neurochemical data seems problematic and unnecessary. By normalizing all data to the baseline data (p. 24), one artificially creates a baseline period with minimal variation (all are "0"; Figures 2, 3 & 5) and this has implications for statistical power. Because the analysis is a within-subjects analysis, this normalization is not necessary for the analysis itself. It can be useful to normalize data for visualization purposes, but raw data should be analyzed. Indeed, behavioral data are qualitatively similar to the neurochemical data, and those data are not normalized to baseline values.

      Please see our general comment on this issue. We believe normalization does not affect statistical power and is both the standard way and an appropriate way to analyze microdialysis results. We include concentrations of ACh, DA, and 5-HIAA in supplementary tables.

      • The authors should include a discussion (in the Discussion section) of how behavior and neurochemical release are associated during the first half of the experimental session but not in the second half (e.g., differences in ACh and DA release between mating and restraint groups during stim 1 and 2, but behavioral differences only during stim 1).

      We have included a section in the Discussion concerning the temporal relationship between behavioral responses and neurochemical changes in response to vocal playback. We note that the linkage is particularly strong in some cases (e.g., ACh release and flinching). This points to a need to examine these phenomena with finer temporal resolution, but also with the recognition that the brain circuits driving a behavioral response may extend beyond the BLA.

      Minor comments:

      • Keywords: add "serotonin" (even though there are no significant differences on 5-HIAA, people interested in serotonin would find this interesting).

      Added to keywords list.

      • Do the authors collect data on the vocalizations of mice in response to these playbacks?

      We monitored vocalizations during playback, noting that vocalizations–especially “Noisy” vocalizations–were common. However, we did not record vocalizations and are therefore unable to quantify our observations.

      • First line of page 7: readers do not know about "stim 1" and "stim 2". Therefore, the authors need to describe their approach to analyzing behavior and neurochemical release.

      We first introduce these terms earlier, citing Figure 1D,E. We have added some additional wording for further clarification (p. 7, lines 4-5).

      • Make sure citations are uniformly formatted (e.g., Inconsistencies in: "As male and female mice emit different vocalizations during mating (Finton et al., 2017; J. M. S. Grimsley et al., 2013; Neunuebel et al., 2015; Sales (née Sewell), 1972)").

      We have reviewed and corrected citations throughout the manuscript.

      • Last paragraph of page 7: "attending behavior" has not been defined yet.

      Table 1 contains our description of the behaviors analyzed in this study. We have now inserted a reference to Table 1 earlier in the Results (p. 6, line 12).

      • Figure 2E and 3G: I find these correlations to be redundant with the GLMs. This is because the significant relationship is likely to be driven by group differences in behavior and in neurochemical release.

      Please see general comments regarding correlation analyses. We removed such analyses and references to them from the manuscript.

      • Page 2, 2nd paragraph, 2nd sentence: this paragraph seems to be rooted in comparing and contrasting experienced and inexperienced mice, so there should be explicit comparisons in each sentence. For example, the 2nd sentence should read: "Whereas EXP estrus females demonstrated increased flinching behaviors in response to mating vocalizations, INEXP ....". This paragraph overall could use some refining.

      We believe this refers to page 9. We have revised the paragraph to clarify our findings (Beginning p. 9, line 23).

      • Page 9: "Further, there were no significant differences across groups during Stim 1 or Stim 2 periods. These results contrast sharply with those from all EXP groups, in which both ACh and DA release changed significantly during playback (Figs. 2C, 2D, 3E, 3F)." While I understand their perspective, this is misleading because changes were only observed during the Stim 1 period.

      We have slightly revised the wording in this paragraph, because the restraint males did not show significant ACh decreases. However, we do not believe our statements mislead readers just because some changes are observed in only one of the stimulation periods (p. 10, lines 13-16).

      • Last paragraph of page 14: it would be useful to mention the increase in flinching in experienced females in response to mating vocalizations.

      We have added a sentence in this paragraph relating flinching in estrus females to increased ACh (p. 15, lines 18-20).

      • Was there a full analysis of locomotion in response to playbacks? I see that locomotion was correlated with neurochemical release but was it different in response to different stimuli? Were there changes to the part of the arena that mice occupied in response to restraint vs. mating vocalizations? Given their methods section, it would be useful for the authors to mention the results of the analyses of these aspects of movement.

      We have provided additional descriptions of space use and video tracking data in Materials and Methods (p. 23, lines 1-6). We now report additional results associated with these analyses (p. 8, lines 13-15; p. 9, lines 8-14).

      • I believe that each experimental mouse only heard one of the stimuli (given the analytical approach). Because it is plausible to measure neurochemical release in response to both types of stimuli, I encourage the authors to be more explicit about this aspect of the experimental design (e.g., mention in Results section).

      Sentence modified to read: “Each mouse received playback of either the mating or restraint stimuli, but not both: same-day presentation of both stimuli would require excessively long playback sessions, the condition of the same probe would likely change on subsequent days, and quality of a second implanted probe on a subsequent day was uncertain.” (p. 7, lines 5-9).

      • Figure 1A and 1B: add labels to the panels so readers don't have to read the legend to know what spectrogram is associated with what context.

      We added these labels to Figure 1.

      • Table 1: in the definition of "still and alert", should this mention "abrupt attending" instead of "abrupt freezing"? The latter isn't described.

      Yes, we intended “abrupt attending”, and have now indicated that in Table 1.

      Reviewer #2 (Recommendations For The Authors):

      Major comments:

      • The authors report they performed manual behavioral analysis, and provide a table defining the different behaviors. However, it remains unclear how some of these behaviors were detected (such as still-and-alert events). A thorough description of the criteria used to define these events needs to be provided.

      We have modified some descriptions of manually analyzed behaviors in Table 1, and have added additional description of how we developed this set of behaviors for analysis in the study (pp. 22-23).

      • The box plots do not appear to represent the "minimum, first quartile, median, third quartile, and maximum values." as specified on page 24 (Methods). Indeed, the individual data points sometimes do not reach the max or min of the bar plot, and sometimes are way beyond them.

      We used the “inclusive median” function in Excel to generate final boxplots. These boxplots will sometimes result in a data point being placed outside of the whiskers. SPSS considers these to be “outliers”, but our GLM analysis includes these values. We describe this in the Data Analysis section of Materials and Methods (p. 28, lines 3-9).
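      The following minimal sketch shows how a data point can fall outside the whiskers of such a boxplot. It assumes quartiles computed with an inclusive-median method and whiskers drawn to the most extreme values within 1.5 times the interquartile range of the box; these conventions are an assumption about the plotting software, and the function name and values are hypothetical.

```python
import statistics


def box_summary(values):
    """Quartiles (inclusive method), whisker ends, and points beyond them."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    beyond = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "beyond_whiskers": beyond,
    }


# Hypothetical values, not data from the study.
print(box_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

      Points listed under `beyond_whiskers` are still real observations; whether they are plotted separately is a display choice and does not by itself justify excluding them from analysis.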

      • Some of the data are replicated in different Figures: Figure 2A and Figure 3C. While this is acceptable, the authors did not correct for multiple comparisons (dividing the p value by the number of comparisons).

      Our analysis included corrections for multiple comparisons, as we have indicated on p. 27, lines 15-16.
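      For readers unfamiliar with the correction the reviewer describes, here is a minimal sketch. The response does not state which correction was applied; Bonferroni is assumed here purely for illustration. It compares each p value with alpha divided by the number of comparisons (equivalently, it multiplies each p value by that number). The function name and p values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p values and per-test rejection decisions."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p <= alpha / m for p in p_values]
    return adjusted, reject


# Hypothetical p values, not results from the study.
adjusted, reject = bonferroni([0.01, 0.04, 0.5])
print(adjusted)
print(reject)
```

      Bonferroni is conservative; other procedures, such as Holm's step-down method, control the same error rate with more power.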

      • Overall, the sample sizes are too small (for example in Figure 3, non-estrus females are at n=3), and are different in experiments where they should be equal (Figure 2B: mating stim 1 is at n=5 and mating stim 2 is at n=3).

      We apologize that sample sizes were not properly displayed in figures. Please note that sample sizes are identified in the figure captions. For neuromodulator data, all sample sizes are at least 7. For behavioral data, the minimum sample size is 5. We have revised Figures 3-6 to ensure that all data points are visible.

      • It remains unclear why the impact of mating vocalizations has been tested only in males.

      We assume the reviewer meant that only males were tested in restraint. We now indicate that our preliminary evidence indicated no difference in behavioral responses to restraint vocalization between males and females, so we opted to perform the neurochemical analysis for restraint only in males (p. 22, lines 4-5). If there were no limitations to time and cost, we would have preferred to test responses to restraint in females as well. We note that such inclusion would have added up to 4 experimental groups (estrus and non-estrus groups in both EXP and INEXP groups).

      • The correlation between the number of flinching and ACh release changes (Figure 2E) visually appears to be opposite between mating and restraint playbacks. The authors should perform independent correlations for these 2 playbacks.

      Please see general comments regarding correlation analyses. We removed such analyses and references to them from the manuscript.

      • The authors state that their findings "indicate that behavioral responses to salient vocalizations result from interactions between sex of the listener or context of vocal stimuli with the previous behavioral experience associated with these vocalizations.". However, in male mice, they do not report any difference in previous experience on flinching for both restraint and mating sounds, as well as no difference in rearing for the restraint sounds (Figure 4A-B). Thus, the discussion of these results should be completely revisited.

      We revised the paragraph in question (p. 9, line 22 through p. 10, line 9). For instance, we note that significant differences between EXP male-mating and male-restraint flinching do not exist between the INEXP groups. We believe that the last sentence correctly summarizes findings described in this paragraph.

      • For serotonin experiments in Figure S2 there are strong outliers (150% increase in 5HIAA release). Did the authors correlate these levels with the behavior of the animals?

      Outliers are identified by the Excel function that generated the boxplots, but we have no reason to consider these as outliers and exclude them. As noted above, we have clarified that these “outliers” are the result of the Excel function in the Materials and Methods (p. 28, lines 3-9) and we have revised the plotting of data points.

      Minor comments:

      • Mating vocalization playback is mainly emitted by males, thus, instead of a positive valence signal, this could also be interpreted as a competitive signal to other males.

      There is support in the literature for viewing our mating stimulus as having positive valence. Gaub et al., 2016 describe the emission of stepped calls, lower frequency harmonics, and increased sound level as indicators of “positive emotion”. We have shown (Grimsley et al., 2013) that the female LFH vocalization can be highly attractive to male mice, under the right conditions, indicating something like “sex is happening”. The inclusion of both the male and female vocalizations in our stimuli was a key piece of our experimental design, based on our understanding of the contributions of both vocalizations to the meaning of the overall acoustic experience.

      • Figure 1 should include panel titles.

      No change. This information is available in the Figure caption.

      • n=31 should be indicated in the EXP group.

      We are not sure which part of the manuscript the reviewer is referring to with this value.

      • The color legend of Figure 1E is absent, making the Figure not understandable.

      We added text in the Figure 1 caption to indicate that each color represents a different exemplar. We don’t think a legend provides additional useful information.

      • The point of making two blocks (stim 1 and stim2) should be stated more clearly.

      Please see general statement regarding experimental blocks. We have modified our description of these in an Experimental overview section in the Materials and Methods.

      • Including raw data of micro-dialysis in the supplementary figures would allow assessment of the variability and quality of the measurements.

      We have added concentrations of neurochemicals in supplemental tables 1-3.

      • Baseline (prestimulus) number of flinch and rearing should systematically be indicated (missing in Figure 4).

      The focus in this figure is on the differences that occur in Stim 1 values. There are no differences between EXP and INEXP animals of any group during the Pre-Stim period. We now state that in the Figure 4 caption.

      • Discussion: "increase in AMPA/NMDA currents". We believe the authors are referring to the ratio of AMPA to NMDA currents. This sentence should be reformulated.

      These are modified to refer to “… the AMPA/NMDA current ratio…” in two locations in the Discussion (p. 14, lines 8-9; p. 15, line 4).

      • Overall the discussion is very speculative and should rely more on the data.

      We believe that the Discussion provides appropriate speculation that is based on our experimental data and previous literature. We have added a paragraph to identify limitations of our findings and recommendations of future experiments to resolve some issues (p. 12, lines 3-17).

      Reviewer #3 (Recommendations For The Authors):

      Minor concerns:

      1) The authors stated that USVs are most likely to be emitted by males, and LFH are likely to be emitted by females. However, Oliveira-Stahl et al. 2023, Matsumoto et al. 2022, Warren et al. 2018, Heckman et al. 2017, Neunuebel et al., 2015 showed that females also emit USVs. The authors should mention that USVs are emitted by both males and females and discuss how the sex of the vocalizing animal (both males and females) can influence neuromodulator release.

      The reviewer slightly mis-stated the wording of our text, changing the meaning significantly. Our wording is “These sequences included ultrasonic vocalizations (USVs) with harmonics, steps, and complex structure, mostly emitted by males, and low frequency harmonic calls (LFHs) emitted by females (Fig. 1A,C)…” This phrasing is correct and carefully chosen. The Discussion in Oliveira-Stahl et al 2023 (p. 10-11) supports our statement: “The exact fraction of USVs emitted by females as concluded in all previous studies on dyadic courtship has varied, ranging from 18%, 17.5%, and 16% to 10.5% in the present study…”.

      2) The authors should explain why ECF from BLA was collected unilaterally from the left hemisphere.

      p. 23, lines 9-11: We inserted a sentence to explain why we targeted the BLA unilaterally. “Since both left and right amygdala are responsive to vocal stimuli in human and experimental animal studies (Wenstrup et al., 2020), we implanted microdialysis probes into the left amygdala to maintain consistency with other studies in our laboratory.” Beyond that, the choice was arbitrary.

      3) The authors said each animal recovered in its home cage for four days before the playback experiment. A 4-day period may not be sufficient for every animal to recover from surgery, so the authors should describe how a mouse's recovery was assessed.

      p. 23, lines 20-23: We provide more description about the recovery and how it was assessed. Except for a few animals that were not included in the experiments, all animals recovered within 4 days.

      4) The authors stated that each animal was exposed to 90-min sessions with mating and restraint behaviors in a counterbalanced design. This description for Figure 1D should also include the duration of the mating and restraint experience.

      The Results that immediately precede citation to this figure include this information.

      5) The authors stated, "Data are reported only from mice with more than 75% of the microdialysis probe implanted within the BLA". What are the implications of having 25% of the probe outside the BLA? The authors should shed more light on this by discussing this issue as it relates to the findings and commenting on where the other 25% of the probe was located.

      We inserted a sentence to explain the rationale for this inclusion criterion. “We verified placement of microdialysis probes to minimize variability that could arise because regions surrounding BLA receive neurochemical inputs from different sources (e.g., cholinergic inputs to putamen and central amygdala).” (p. 25, lines 21-23).

      All brain regions that surround BLA, dorsal, medial, ventral, or lateral, could have been sampled by the “other” 25%. Some of these, e.g., the central amygdala or caudate-putamen, have different sources of cholinergic input that may not have the same release pattern. We do not think it is worthy of further speculation in the Discussion. Due to the high cost of the neurochemical analysis, we often did not process the neurochemistry data if histology indicated that a probe missed the BLA target.

      6) The authors confirmed that the estrus stage did not change during the experiment day by evaluating and comparing estrus prior to and after data collection. This strategy was a fantastic experimental approach, but the authors should have discussed the results. How did the results the authors included change when the females were in estrus before but not after data collection? What percentage of females started in estrus but ended in metestrus? Assuming that some females changed estrus state, were these animals excluded from the analyses?

      All animals were in the same estrus state at the beginning and end of the playback session.

      7) The authors cite Neunuebel et al., 2015 for the sentence "As male and female mice emit different vocalizations during mating". However, Neunuebel et al., 2015 showed vocalizations emitted during chasing--not mating. If mating is a general term for courtship, then this reference is appropriate, but see major concern #3.

      In the Results (p. 8, line 5), we changed the phrasing to “courtship and mating” to include the Neunuebel et al. study.

      As we indicate in our response to Public Comment #3, we have modified the Results (p. 5, lines 18-20) and Materials and Methods (p. 21, lines 8-15) to clarify our meaning. We continue to use the term “mating” because this refers to a specific set of behaviors associated with mounting and copulation, rather than the more general term “courtship”. We also indicate that we based these behaviors on previous work (e.g., Gaub et al., 2016).

      8) Authors interpret Figure 3F as DA release showed a "consistent" increase during mating playback across all three experimental groups. However, the increase in the estrus female group is inconsistent, as seen in the graph. This verbiage should be reworded to describe the data more accurately.

      p. 8, line 23 “consistent” was deleted.

      9) In all the box plots, multiple data points overlay each other. A more transparent way of showing the data would be adding some jitter to the x value to make each data point visible. The mean (X's) in Figure 3D (pre-stim mating and mating estrus) are difficult to see, as are all the data points in mating non-estrus. Adding all the symbols to the figure legend or a key in the figure instead of the method section would aid the reader and make the plots easier to interpret.

      We have revised the boxplots to ensure that all data points are visible.

      10) Some verbiage used in the discussion should be toned down. For example, "intense" experiences and "emotionally charged" vocalizations should be removed.

      We have not changed these terms, which we believe are appropriate to describe these experiences and vocalizations.

      11) The authors include "Emotional Vocalizations" in the title. It would be beneficial if the authors included more detail and references in the introduction to help set up the emotional content of vocalizations. It may benefit a broader readership as typically targeted by eLife.

      We now cite Darwin and some more recent publications that articulate the general understanding that social vocalizations carry emotional content.

    2. eLife assessment

      This important study advances our understanding of how different types of communication signals differentially affect mouse behaviors and amygdala cholinergic/dopaminergic neuromodulation. Researchers interested in the complex interaction between prior experience, sex, behavior, hormonal status, and neuromodulation should benefit from this study. Nevertheless, some of the statistical comparisons using baseline normalized data might result in inadequate statistical power at this stage, requiring additional analysis to support the conclusions fully. With a few analytical parts strengthened, this paper will be of interest to neuroscientists and ethologists.

    3. Reviewer #1 (Public Review):

      The manuscript addresses a fundamental question about how different types of communication signals differentially affect brain state and neurochemistry. In addition, their manuscript highlights the various processes that modulate brain responses to communication signals, including prior experience, sex, and hormonal status. Overall, the manuscript is well-written and the research is appropriately contextualized.

      That being said, it remains important for the authors to think more about their analytical approaches, in particular the effect of normalization and the explicit outlining and interpretation of statistical models. As mentioned in the original review, the normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis. By normalizing all data to the baseline data and including this baseline data in the repeated measures analysis, one artificially creates a baseline period with minimal variation that dramatically differs in variance from other periods (akin to heteroscedasticity). If the authors want to analyze how a stimulus changes neurochemical concentrations, they could analyze the raw data but depict normalized data in their figures (similar to other papers). Or they could analyze group differences in the normalized data of the two stimulus periods (i.e., excluding the baseline period used for normalization).

      It would also be useful for the authors to provide further discussion of the potential contributions of different types of experiences (mating vs. restraint) to the change in behavior and neurochemical responses to the vocalization playbacks and to try to disentangle sensory and motor contributions to neurochemical changes.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents potentially valuable results on glutamine-rich motifs in relation to protein expression and alternative genetic codes. The authors' interpretation of the results is so far only supported by incomplete evidence, due to a lack of acknowledgment of alternative explanations, missing controls and statistical analysis, and writing that is unclear to non-experts in the field. These shortcomings could be at least partially overcome by additional experiments, thorough rewriting, or both.

      We thank both the Reviewing Editor and Senior Editor for handling this manuscript.

      Based on your suggestions, we have provided controls, performed statistical analysis, and rewritten our manuscript. The revised manuscript is significantly improved and more accessible to non-experts in the field.

      Reviewer #1 (Public Review):

      Summary

      This work contains 3 sections. The first section describes how protein domains with SQ motifs can increase the abundance of a lacZ reporter in yeast. The authors call this phenomenon autonomous protein expression-enhancing activity, and this finding is well supported. The authors show evidence that this increase in protein abundance and enzymatic activity is not due to changes in plasmid copy number or mRNA abundance, and that this phenomenon is not affected by mutants in translational quality control. It was not completely clear whether the increased protein abundance is due to increased translation or to increased protein stability.

      In section 2, the authors performed mutagenesis of three N-terminal domains to study how protein sequence changes protein stability and enzymatic activity of the fusions. These data are very interesting, but this section needs more interpretation. It is not clear if the effect is due to the number of S/T/Q/N amino acids or due to the number of phosphorylation sites.

      In section 3, the authors undertake an extensive computational analysis of amino acid runs in 27 species. Many aspects of this section are fascinating to an expert reader. They identify regions with poly-X tracks. These data were not normalized correctly: I think that a null expectation for how often poly-X tracks occur should be built for each species based on the underlying prevalence of amino acids in that species. As a result, I believe that the claim is not well supported by the data.

      Strengths

      This work is about an interesting topic and contains stimulating bioinformatics analysis. The first two sections, where the authors investigate how S/T/Q/N abundance modulates protein expression level, are well supported by the data. The bioinformatics analysis of Q abundance in ciliate proteomes is fascinating. There are some ciliates that have repurposed stop codons to code for Q. The authors find that in these proteomes, Q-runs are greatly expanded. They offer interesting speculations on how this expansion might impact protein function.

      Weakness

      At this time, the manuscript is disorganized and difficult to read. An expert in the field, who will not be distracted by the disorganization, will find some very interesting results included. In particular, the order of the introduction does not match the rest of the paper.

      In the first and second sections, where the authors investigate how S/T/Q/N abundance modulates protein expression levels, it is unclear if the effect is due to the number of phosphorylation sites or the number of S/T/Q/N residues.

      There are three reasons why the number of phosphorylation sites in the Q-rich motifs is not relevant to their autonomous protein expression-enhancing (PEE) activities:

      First, we have reported previously that phosphorylation-defective Rad51-NTD (Rad51-3SA) and wild-type Rad51-NTD exhibit similar autonomous PEE activity. Mec1/Tel1-dependent phosphorylation of Rad51-NTD antagonizes the proteasomal degradation pathway, increasing the half-life of Rad51 from ∼30 min to ≥180 min (1). (page 1, lines 11-14)

      Second, in our preprint manuscript, we have already shown that phosphorylation-defective Rad53-SCD1 (Rad53-SCD1-5STA) also exhibits autonomous PEE activity similar to that of wild-type Rad53-SCD1 (Figure 2D, Figure 4A and Figure 4C). We have highlighted this point in our revised manuscript (page 9, lines 19-21).

      Third, as revealed by the results of Figure 4, it is the percentages, and not the numbers, of S/T/Q/N residues that are correlated with the PEE activities of Q-rich motifs.

      The authors also do not discuss if the N-end rule for protein stability applies to the lacZ reporter or the fusion proteins.

      The autonomous PEE function of S/T/Q-rich NTDs is unlikely to be relevant to the N-end rule. The N-end rule links the in vivo half-life of a protein to the identity of its N-terminal residues. In S. cerevisiae, the N-end rule operates as part of the ubiquitin system and comprises two pathways. First, the Arg/N-end rule pathway, involving a single N-terminal amidohydrolase Nta1, mediates deamidation of N-terminal asparagine (N) and glutamine (Q) into aspartate (D) and glutamate (E), which in turn are arginylated by a single Ate1 R-transferase, generating the Arg/N degron. N-terminal R and other primary degrons are recognized by a single N-recognin Ubr1 in concert with ubiquitin-conjugating Ubc2/Rad6. Ubr1 can also recognize several other N-terminal residues, including lysine (K), histidine (H), phenylalanine (F), tryptophan (W), leucine (L) and isoleucine (I) (68-70). Second, the Ac/N-end rule pathway targets proteins containing N-terminally acetylated (Ac) residues. Prior to acetylation, the first amino acid methionine (M) is catalytically removed by Met-aminopeptidases (MetAPs), unless a residue at position 2 is non-permissive (too large) for MetAPs. If a retained N-terminal M or otherwise a valine (V), cysteine (C), alanine (A), serine (S) or threonine (T) residue is followed by residues that allow N-terminal acetylation, the proteins containing these AcN degrons are targeted for ubiquitylation and proteasome-mediated degradation by the Doa10 E3 ligase (71).

      The PEE activities of these S/T/Q-rich domains are unlikely to arise from counteracting the N-end rule for two reasons. First, the first two amino acid residues of Rad51-NTD, Hop1-SCD, Rad53-SCD1, Sup35-PND, Rad51-ΔN, and LacZ-NVH are MS, ME, ME, MS, ME, and MI, respectively, where M is methionine, S is serine, E is glutamic acid and I is isoleucine. Second, Sml1-NTD behaves similarly to these N-terminal fusion tags, despite its methionine and glutamine (MQ) amino acid signature at the N-terminus. (Page 12, line 3 to page 13, line 2)

      The most interesting part of the paper is an exploration of S/T/Q/N-rich regions and other repetitive AA runs in 27 proteomes, particularly ciliates. However, this analysis is missing a critical control that makes it nearly impossible to evaluate the importance of the findings. The authors find the abundance of different amino acid runs in various proteomes. They also report the background abundance of each amino acid. They do not use this background abundance to normalize the runs of amino acids to create a null expectation from each proteome. For example, it has been clear for some time (Ruff, 2017; Ruff et al., 2016) that Drosophila contains a very high background of Q's in the proteome and it is necessary to control for this background abundance when finding runs of Q's.

      We apologize for not explaining sufficiently well the topic eliciting this reviewer’s concern in our preprint manuscript. In the second paragraph of page 14, we cite six references to highlight that SCDs are overrepresented in yeast and human proteins involved in several biological processes (5, 43) and that polyX prevalence differs among species (79-82).

      We will cite a reference by Kiersten M. Ruff in our revised manuscript (38).

      K. M. Ruff, J. B. Warner, A. Posey and P. S. Tan (2017) Polyglutamine length dependent structural properties and phase behavior of huntingtin exon1. Biophysical Journal 112, 511a.

      The authors could easily address this problem with the data and analysis they have already collected. However, at this time, without this normalization, I am hesitant to trust the lists of proteins with long runs of amino acid and the ensuing GO enrichment analysis. Ruff KM. 2017. Washington University in St.

      Ruff KM, Holehouse AS, Richardson MGO, Pappu RV. 2016. Proteomic and Biophysical Analysis of Polar Tracts. Biophys J 110:556a.

      We thank Reviewer #1 for this helpful suggestion and now address this issue by means of a different approach described below.

      Based on a previous study (43), we applied seven different thresholds to seek both short and long, as well as pure and impure, polyX strings in 20 different representative near-complete proteomes, including 4X (4/4), 5X (4/5-5/5), 6X (4/6-6/6), 7X (4/7-7/7), 8-10X (≥50%X), 11-20X (≥50%X) and ≥21X (≥50%X).

      To normalize the runs of amino acids and create a null expectation from each proteome, we determined the ratios of the overall number of X residues for each of the seven polyX motifs relative to those in the entire proteome of each species, respectively. The results of four different polyX motifs are shown in our revised manuscript, i.e., polyQ (Figure 7), polyN (Figure 8), polyS (Figure 9) and polyT (Figure 10). Thus, polyX prevalence differs among species and the overall X contents of polyX motifs often but not always correlate with the X usage frequency in entire proteomes (43).
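
      The normalization described above (X residues found inside polyX motifs, relative to all X residues in the proteome) can be sketched as follows. This is a simplified illustration, not the authors' program: the function names, the sliding-window definition of a motif, and the toy sequences are all assumptions made here for clarity.

```python
def residues_in_polyx(seq, aa, window, min_count):
    """Count `aa` residues that lie in at least one window of length `window`
    containing at least `min_count` copies of `aa` (e.g. window=5, min_count=4
    for a '5X (4/5-5/5)'-style threshold)."""
    covered = [False] * len(seq)
    for start in range(len(seq) - window + 1):
        if seq[start:start + window].count(aa) >= min_count:
            for i in range(start, start + window):
                covered[i] = True
    return sum(1 for i, c in enumerate(seq) if covered[i] and c == aa)


def polyx_fraction(proteome, aa, window, min_count):
    """Fraction of all `aa` residues in the proteome that sit inside polyX motifs."""
    total = sum(seq.count(aa) for seq in proteome)
    if total == 0:
        return 0.0
    in_motifs = sum(residues_in_polyx(seq, aa, window, min_count) for seq in proteome)
    return in_motifs / total


# Hypothetical two-protein "proteome" used only to exercise the functions.
toy_proteome = ["MQQQQAKQ", "MSTQAQ"]
fraction_4q = polyx_fraction(toy_proteome, "Q", window=4, min_count=4)
```

      In the toy example, four of the seven Q residues fall inside a pure 4Q run, so the fraction is 4/7. Comparing such fractions across species separates a high overall X usage from a genuine tendency of X residues to cluster.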

      Most importantly, our results reveal that, compared to Stentor coeruleus or several non-ciliate eukaryotic organisms (e.g., Plasmodium falciparum, Caenorhabditis elegans, Danio rerio, Mus musculus and Homo sapiens), the five ciliates with reassigned TAAQ and TAGQ codons not only have higher Q usage frequencies, but also more polyQ motifs in their proteomes (Figure 7). In contrast, polyQ motifs prevail in Candida albicans, Candida tropicalis, Dictyostelium discoideum, Chlamydomonas reinhardtii, Drosophila melanogaster and Aedes aegypti, though the Q usage frequencies in their entire proteomes are not significantly higher than those of other eukaryotes (Figure 1). Due to their higher N usage frequencies, Dictyostelium discoideum, Plasmodium falciparum and Pseudocohnilembus persalinus have more polyN motifs than the other 23 eukaryotes we examined here (Figure 8). Generally speaking, all 26 eukaryotes we assessed have similar S usage frequencies and percentages of S contents in polyS motifs (Figure 9). Among these 26 eukaryotes, Dictyostelium discoideum possesses many more polyT motifs, though its T usage frequency is similar to that of the other 25 eukaryotes (Figure 10).

      In conclusion, these new normalized results confirm that the reassignment of stop codons to Q indeed results in both higher Q usage frequencies and more polyQ motifs in ciliates.  

      Reviewer #2 (Public Review):

      Summary:

      This study seeks to understand the connection between protein sequence and function in disordered regions enriched in polar amino acids (specifically Q, N, S and T). While the authors suggest that specific motifs facilitate protein-enhancing activities, their findings are correlative, and the evidence is incomplete. Similarly, the authors propose that the re-assignment of stop codons to glutamine-encoding codons underlies the greater use of glutamine in a subset of ciliates, but again, the conclusions here are, at best, correlative. The authors perform extensive bioinformatic analysis, with detailed (albeit somewhat ad hoc) discussion on a number of proteins. Overall, the results presented here are interesting, but are unable to exclude competing hypotheses.

      Strengths:

      Following up on previous work, the authors wish to uncover a mechanism associated with poly-Q and SCD motifs explaining proposed protein expression-enhancing activities. They note that these motifs often occur in IDRs and hypothesize that structural plasticity could be capitalized upon as a mechanism of diversification in evolution. To investigate this further, they employ bioinformatics to investigate the sequence features of proteomes of 27 eukaryotes. They deepen their sequence space exploration, uncovering sub-phylum-specific features associated with species in which a stop-codon substitution has occurred. The authors propose this stop-codon substitution underlies an expansion of poly-Q repeats and increased glutamine distribution.

      Weaknesses:

      The preprint provides extensive, detailed, and entirely unnecessary background information throughout, hampering reading and making it difficult to understand the ideas being proposed.

      The introduction provides a large amount of detailed background that appears entirely irrelevant for the paper. In many places, detailed discussions of specific proteins that are likely of interest to the authors occur without context, which does not enhance the paper for the reader.

      The paper uses many unnecessary, new, or redefined acronyms which makes reading difficult. As examples:

      1) Prion forming domains (PFDs). Do the authors mean prion-like domains (PLDs), an established term with an empirical definition from the PLAAC algorithm? If yes, they should say this. If not, they must define what a prion-forming domain is formally.

      The N-terminal domain (1-123 amino acids) of S. cerevisiae Sup35 was already referred to as a “prion forming domain (PFD)” in 2006 (48). Since then, PFD has also been employed as an acronym in other yeast prion papers (Cox, B.S. et al. 2007; Toombs, T. et al. 2011).

      B. S. Cox, L. Byrne, M. F. Tuite, Protein Stability. Prion 1, 170-178 (2007).

      J. A. Toombs, N. M. Liss, K. R. Cobble, Z. Ben-Musa, E. D. Ross, [PSI+] maintenance is dependent on the composition, not primary sequence, of the oligopeptide repeat domain. PLoS One 6, e21953 (2011).

      2) SCD is already an acronym in the IDP field (meaning sequence charge decoration) - the authors should avoid this as their chosen acronym for Serine(S) / threonine (T)-glutamine (Q) cluster domains. Moreover, do we really need another acronym here (we do not).

      SCD was first used in 2005 as an acronym for the Serine (S)/threonine (T)-glutamine (Q) cluster domain in the DNA damage checkpoint field (4). Almost a decade later, SCD became an acronym for “sequence charge decoration” (Sawle, L. et al. 2015; Firman, T. et al. 2018).

      L. Sawle and K. Ghosh, A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins. J. Chem Phys. 143, 085101 (2015).

      T. Firman and K. Ghosh, Sequence charge decoration dictates coil-globule transition in intrinsically disordered proteins. J. Chem Phys. 148, 123305 (2018).

      3) Protein expression-enhancing (PEE) - just say expression-enhancing, there is no need for an acronym here.

      Thank you. Since we have shown that the addition of Q-rich motifs to LacZ affects protein expression rather than transcription, we think it is better to use the “PEE” acronym.

      The results suggest autonomous protein expression-enhancing activities of regions of multiple proteins containing Q-rich and SCD motifs. Their definition of expression-enhancing activities is vague and the evidence they provide to support the claim is weak. While their previous work may support their claim with more evidence, it should be explained in more detail. The assay they choose is a fusion reporter measuring beta-galactosidase activity and tracking expression levels. Given the presented data, they have shown that they can drive the expression of their reporters and that beta gal remains active, in addition to the increase in expression of the fusion reporter during the stress response. They have not detailed what their control and mock treatment are, which makes complete understanding of their experimental approach difficult. Furthermore, their nuclear localization signal on the tag could be influencing the degradation kinetics or sequestering the reporter, leading to its accumulation and the appearance of enhanced expression. Their evidence refuting ubiquitin-mediated degradation does not have a convincing control.

      Although this reviewer’s concern regarding our use of a nuclear localization signal on the tag is understandable, we are confident that this signal does not bias our findings for two reasons. First, the negative control LacZ-NV also possesses the same nuclear localization signal (Figure 1A, lane 2). Second, another fusion target, Rad51-ΔN, does not harbor the NVH tag (Figure 1D, lanes 3-4). Compared to wild-type Rad51, Rad51-ΔN is highly labile. In our previous study, removal of the NTD from Rad51 reduced by ~97% the protein levels of corresponding Rad51-ΔN proteins relative to wild-type (1).

      Based on the experimental results, the authors then go on to perform bioinformatic analysis of SCD proteins and polyX proteins. Unfortunately, there is no clear hypothesis for what is being tested; there is a vague sense of investigating polyX/SCD regions, but I did not find the connection between the first and second sections compelling (especially given polar-rich regions have been shown to engage in many different functions). As such, this bioinformatic analysis largely presents as many lists of percentages without any meaningful interpretation. The bioinformatics analysis lacks any kind of rigorous statistical tests, making it difficult to evaluate the conclusions drawn. The methods section is severely lacking. Specifically, many of the methods require the reader to read many other papers. While referencing prior work is, of course, important, the authors should ensure the methods in this paper provide the details needed to allow a reader to evaluate the work being presented. As it stands, this is not the case.

      Thank you. As described in detail below, we have now performed rigorous statistical testing using the GOfuncR package (Figure 11, Figure 12 and DS7-DS32).

      Overall, my major concern with this work is that the authors make two central claims in this paper (as per the Discussion). The authors claim that Q-rich motifs enhance protein expression. The implication here is that Q-rich motif IDRs are special, but this is not tested. As such, they cannot exclude the competing hypothesis ("N-terminal disordered regions enhance expression").

      In fact, “N-terminal disordered regions enhance expression” exactly summarizes our hypothesis.

      On pages 12-13 and Figure 4 of our preprint manuscript, we explained our hypothesis in the paragraph entitled “The relationship between PEE function, amino acid contents, and structural flexibility”.

      The authors also do not explore the possibility that this effect is in part/entirely driven by mRNA-level effects (see Verma Nat Comms 2019).

      As pointed out by the first reviewer, we present evidence that the increase in protein abundance and enzymatic activity is not due to changes in plasmid copy number or mRNA abundance (Figure 2), and that this phenomenon is not affected in translational quality control mutants (Figure 3).

      As such, while these observations are interesting, they feel preliminary and, in my opinion, cannot be used to draw hard conclusions on how N-terminal IDR sequence features influence protein expression. This does not mean the authors are necessarily wrong, but from the data presented here, I do not believe strong conclusions can be drawn.

      The second claim is that re-assignment of stop codons to Q increases proteome-wide Q usage. I was unable to understand what result led the authors to this conclusion.

      My reading of the results is that a subset of ciliates has re-assigned UAA and UAG from the stop codon to Q. Those ciliates have more polyQ-containing proteins. However, they also have more polyN-containing proteins and proteins enriched in S/T-Q clusters. Surely if this were a stop-codon-dependent effect, we'd ONLY see an enhancement in Q-richness, not a corresponding enhancement in all polar-rich IDR frequencies? It seems the better working hypothesis is that free-floating ciliate proteomes are enriched in polar amino acids compared to sessile ciliates.

      We thank this reviewer for raising this point; however, these comments are not supported by the results in Figure 7.

      Regardless, the absence of any kind of statistical analysis makes it hard to draw strong conclusions here.

      We apologize for not explaining more clearly the results of Tables 5-7 in our preprint manuscript.

      To address the concerns raised by both reviewers about our GO enrichment analysis, we have now performed rigorous statistical testing for SCD and polyQ protein overrepresentation using the GOfuncR package (https://bioconductor.org/packages/release/bioc/html/GOfuncR.html). GOfuncR is an R package that conducts standard candidate vs. background enrichment analysis by means of the hypergeometric test. We then adjusted the raw p-values according to the family-wise error rate (FWER). The same method had been applied to GO enrichment analysis of human genomes (89).
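
      For readers unfamiliar with candidate vs. background enrichment, the hypergeometric test can be sketched in a few lines. This is an illustration of the statistic only and does not reproduce GOfuncR: the function names and the example counts are hypothetical, and a Bonferroni adjustment is swapped in here as a simple FWER-controlling procedure, which may differ from the correction GOfuncR applies.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n candidate genes are drawn without replacement from a
    background of N genes, K of which are annotated with the GO term."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def bonferroni(p_values):
    """Simple family-wise error rate control: multiply by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts: 10 background genes, 3 annotated with a term,
# 4 candidate (e.g. polyQ-containing) genes, all 3 annotated genes among them.
p_raw = hypergeom_upper_tail(k=3, N=10, K=3, n=4)
p_adjusted = bonferroni([p_raw, 0.5])
```

      A small adjusted p-value indicates that the term is overrepresented among the candidate proteins relative to what random sampling from the background would produce.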

      The results presented in Figure 11 and Figure 12 (DS7-DS32) support our hypothesis that Q-rich motifs prevail in proteins involved in specialized biological processes, including Saccharomyces cerevisiae RNA-mediated transposition, Candida albicans filamentous growth, peptidyl-glutamic acid modification in ciliates with reassigned stop codons (TAAQ and TAGQ), Tetrahymena thermophila xylan catabolism, Dictyostelium discoideum sexual reproduction, Plasmodium falciparum infection, as well as the nervous systems of Drosophila melanogaster, Mus musculus, and Homo sapiens (78). In contrast, peptidyl-glutamic acid modification and microtubule-based movement are not overrepresented with Q-rich proteins in Stentor coeruleus, a ciliate with standard stop codons.

      Recommendations for the authors:

      Please note that you control which revisions to undertake from the public reviews and recommendations for the authors.

      Reviewer #1 (Recommendations For The Authors):

      The order of paragraphs in the introduction was very difficult to follow. Each paragraph was clear and easy to understand, but the order of paragraphs did not make sense to this reader. The order of events in the abstract matches the order of events in the results section. However, the order of paragraphs in the introduction is completely different and this was very confusing. This disordered list of facts might make sense to an expert reader but makes it hard for a non-expert reader to understand.

      Apologies. We endeavored to improve the flow of our revised manuscript to make it more readable.

      The section beginning on pg 12 focused on figures 4 and 5 was very interesting and highly promising. However, it was initially hard for me to tell from the main text what the experiment was. Please add to the text an explanation of the experiment, because it is hard to figure out what was going on from the figures alone. Figure 4 is fantastic, but would be improved by adding error bars and scaling the x-axis to be the same in panels B,C,D.

      Thank you for this recommendation. We have now scaled both the x-axis and y-axis equivalently in panels B, C and D of Figure 4. Error bars are too small to be included.

      It is hard to tell if the key variable is the number of S/T/Q/N residues or the number of phosphosites. I think a good control would be to add a regression against the number of putative phosphosites. The sequences are well designed. I loved this part but as a reader, I need more interpretation about why it matters and how it explains the PEE.

      As described above, we have shown that the number of phosphorylation sites in the Q-rich motifs is not relevant to their autonomous protein expression-enhancing (PEE) activities.

      I believe that the prevalence of polyX runs is not meaningful without normalizing for the background abundance of each amino acid. The proteome-wide abundance and the assumption that amino acids occur independently can be used to form a baseline expectation for which runs are longer than expected by chance. I think Figures 6 and 7 should go into the supplement and be replaced in the main text with a figure where Figure 6 is normalized by Figure 7. For example in P. falciparum, there are many N-runs (Figure 6), but the proteome has the highest fraction of N’s (Figure 7).
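
      One way to form the baseline expectation suggested above is sketched below. It assumes that residues occur independently with the proteome-wide frequency p of the amino acid, so that a protein of length L is expected to contain p^k (1 + (L - k)(1 - p)) maximal runs of at least k residues. The function names and the toy proteome are hypothetical.

```python
import re


def expected_runs(length, p, k):
    """Expected number of maximal runs of at least k identical residues in a
    sequence of the given length, assuming independent residues with frequency p."""
    if length < k:
        return 0.0
    # A run can start at position 0, or at any later position preceded by a non-X residue.
    return p ** k * (1 + (length - k) * (1 - p))


def observed_runs(seq, aa, k):
    """Number of maximal runs of at least k consecutive `aa` residues in seq."""
    return len(re.findall(re.escape(aa) + "{" + str(k) + ",}", seq))


def observed_vs_expected(proteome, aa, k):
    """Return (observed, expected) run counts for a list of protein sequences."""
    total_residues = sum(len(seq) for seq in proteome)
    p = sum(seq.count(aa) for seq in proteome) / total_residues
    observed = sum(observed_runs(seq, aa, k) for seq in proteome)
    expected = sum(expected_runs(len(seq), p, k) for seq in proteome)
    return observed, expected


# Hypothetical sequences used only to exercise the functions.
toy_proteome = ["AQQQQAQQQQQ", "MSTKLAEQ"]
obs, exp = observed_vs_expected(toy_proteome, "Q", 4)
```

      An observed count well above the expected count would indicate that runs are longer or more frequent than amino acid composition alone predicts, which is the comparison requested for species such as P. falciparum with a high background of N.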

      Thank you for these suggestions. The three figures in our preprint manuscript (Figures 6-8) have been moved into the supplementary information (Figures S1-S3). For normalization, we have provided four new figures (Figures 7-10) in our revised manuscript.

      The analysis of ciliate proteomes was fascinating. I am particularly interested in the GO enrichment for “peptidyl-glutamic acid modification” (pg 20) because these enzymes might be modifying some of Q’s in the Q-runs. I might be wrong about this idea or confused about the chemistry. Do these ciliates live in Q-rich environments? Or nitrogen rich environments?

      Polymeric modifications (polymodifications) are a hallmark of C-terminal tubulin tails, whereas secondary peptide chains of glutamic acids (polyglutamylation) and glycines (polyglycylation) are catalyzed from the γ-carboxyl group of primary chain glutamic acids. It is not clear if these enzymes can modify some of the Q’s in the Q-runs.

      To our knowledge, ciliates are abundant in almost every liquid water environment, i.e., oceans/seas, marine sediments, lakes, ponds, and rivers, and even soils.

      I think you should include more discussion about how the codons that code for Q’s are prone to slippage during DNA replication, and thus many Q-runs are unstable and expand (e.g. Huntington’s Disease). The end of pg 24 or pg 25 would be good places.

      We thank the reviewer for these comments.

      PolyQ motifs have a particular length-dependent codon usage that relates to strand slippage in CAG/CTG trinucleotide repeat regions during DNA replication. In most organisms having standard genetic codons, Q is encoded by CAGQ and CAAQ. Here, we have determined and compared proteome-wide Q contents, as well as the CAGQ usage frequencies (i.e., the ratio between CAGQ and the sum of CAGQ, CAAQ, TAAQ, and TAGQ).

      Our results reveal that the likelihood of forming long CAG/CTG trinucleotide repeats is higher in five eukaryotes due to their higher CAGQ usage frequencies, including Drosophila melanogaster (86.6% Q), Danio rerio (74.0% Q), Mus musculus (74.0% Q), Homo sapiens (73.5% Q), and Chlamydomonas reinhardtii (87.3% Q) (orange background, Table 2). In contrast, another five eukaryotes that possess high numbers of polyQ motifs (i.e., Dictyostelium discoideum, Candida albicans, Candida tropicalis, Plasmodium falciparum and Stentor coeruleus) (Figure 1) utilize more CAAQ (96.2%, 84.6%, 84.5%, 86.7% and 75.7%) than CAGQ (3.8%, 15.4%, 15.5%, 13.3% and 24.3%), respectively, to avoid the formation of long CAG/CTG trinucleotide repeats (green background, Table 2). Similarly, all five ciliates with reassigned stop codons (TAAQ and TAGQ) have low CAGQ usage frequencies (i.e., from 3.8% Q in Pseudocohnilembus persalinus to 12.6% Q in Oxytricha trifallax) (red font, Table 2). Accordingly, the CAG-slippage mechanism might operate more frequently in Chlamydomonas reinhardtii, Drosophila melanogaster, Danio rerio, Mus musculus and Homo sapiens than in Dictyostelium discoideum, Candida albicans, Candida tropicalis, Plasmodium falciparum, Stentor coeruleus and the five ciliates with reassigned stop codons (TAAQ and TAGQ).
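
      The CAG usage frequency discussed above is the share of all Q-encoding codons that are CAG. A minimal sketch of that calculation follows; it is not the authors' program, the function names and example sequences are hypothetical, and the set of Q-encoding codons is passed in explicitly so that TAA and TAG can be included for ciliates with reassigned stop codons.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def cag_usage_frequency(coding_sequences, q_codons=("CAG", "CAA")):
    """Ratio of CAG codons to all codons that encode Q in the given genetic code."""
    counts = Counter()
    for cds in coding_sequences:
        counts.update(codon_counts(cds))
    total_q = sum(counts[codon] for codon in q_codons)
    return counts["CAG"] / total_q if total_q else 0.0


# Hypothetical coding sequences, one per genetic code.
standard = cag_usage_frequency(["ATGCAGCAGCAATGA"])
reassigned = cag_usage_frequency(
    ["ATGCAGTAATAGCAATGA"], q_codons=("CAG", "CAA", "TAA", "TAG")
)
```

      In the first toy sequence two of three Q codons are CAG; in the second, where TAA and TAG also encode Q, only one of four is, which mirrors how reassigned stop codons dilute CAG usage.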

      Author response table 1.

      Usage frequencies of TAA, TAG, TAAQ, TAGQ, CAAQ and CAGQ codons in the entire proteomes of 20 different organisms.

      Pg 7, paragraph 2 has no direction. Please add the conclusion of the paragraph to the first sentence.

      This paragraph has been moved to the “Introduction” section of the revised manuscript.

      Pg 8, I suggest only mentioning the PFDs used in the experiments. The rest are distracting.

      We have addressed this concern above.

      Pg 12. Please revise the "The relationship...." text to explain the experiment.

      We apologize for not explaining this topic sufficiently well in our preprint manuscript.

      SCDs are often structurally flexible sequences (4) or even IDRs. Using IUPred2A (https://iupred2a.elte.hu/plot_new), a web-server for identifying disordered protein regions (88), we found that Rad51-NTD (1-66 a.a.) (1), Rad53-SCD1 (1-29 a.a.) and Sup35-PND (1-39 a.a.) are highly structurally flexible. Since a high content of serine (S), threonine (T), glutamine (Q), and asparagine (N) is a common feature of IDRs (17-20), we applied an alanine scanning mutagenesis approach to reduce the percentages of S, T, Q or N in Rad51-NTD, Rad53-SCD1 or Sup35-PND, respectively. As shown in Figure 4 and Figure 5, there is a very strong positive relationship between STQ and STQN amino acid percentages and β-galactosidase activities. (Page 13, lines 5-10)
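
      The quantity varied in this experiment, the percentage of S/T/Q/N residues in a domain, can be computed as in the sketch below. The sequence shown is a made-up peptide, not one of the constructs in the study, and both helper functions are hypothetical names introduced for illustration.

```python
def residue_percentage(seq, residues="STQN"):
    """Percentage of positions in seq occupied by any of the given residues."""
    if not seq:
        return 0.0
    return 100.0 * sum(seq.count(r) for r in residues) / len(seq)


def substitute_with_alanine(seq, targets):
    """Replace every residue listed in `targets` with alanine (A)."""
    return "".join("A" if aa in targets else aa for aa in seq)


# Hypothetical 8-residue peptide used only to exercise the functions.
toy = "MSQTNQSE"
before = residue_percentage(toy)
after = residue_percentage(substitute_with_alanine(toy, "Q"))
```

      Replacing the two Q residues with alanine lowers the STQN percentage of the toy peptide from 75% to 50%, which is the kind of stepwise reduction that is then compared against β-galactosidase activity.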

      Pg 13, first full paragraph, "Functionally, IDRs..." I think this paragraph belongs in the Discussion.

      This paragraph is now in the “Introduction” section (Page 5, Lines 11-15).

      Pg. 15, I think the order of paragraphs should be swapped.

      These paragraphs have been removed or rewritten in the “Introduction” section of our revised manuscript.

      Pg 17 (and other parts) I found the lists of numbers and percentages hard to read and I think you should refer readers to the tables.

      Thank you. In the revised manuscript, we have avoided using lists of numbers and percentages, unless we feel they are absolutely essential.

      Pg. 19 please add more interpretation to the last paragraph. It is very cool but I need help understanding the result. Are these proteins diverging rapidly? Perhaps this is a place to include the idea of codon slippage during DNA replication.

      Thank you. The new results in Table 2 indicate that the CAG-slippage mechanism is unlikely to operate in ciliates with reassigned stop codons (TAAQ and TAGQ).

      Pg 24. "Based on our findings from this study, we suggest that Q-rich motifs are useful toolkits for generating novel diversity during protein evolution, including by enabling greater protein expression, protein-protein interactions, posttranslational modifications, increased solubility, and tunable stability, among other important traits." This idea needs to be cited. Keith Dunker has written extensively about this idea, as have others. Perhaps also discuss why polyQ-rich regions are different from other IDRs and different from other IDRs that phase-separate.

      Agreed, we have cited two of Keith Dunker’s papers in our revised manuscript (73, 74).

      Minor notes:

      Please define Borg genomes (pg 25).

      Borgs are long extrachromosomal DNA sequences in methane-oxidizing Methanoperedens archaea, which display the potential to augment methane oxidation (101). They are now described in our revised manuscript. (Page 15, lines 12-14)

      Reviewer #2 (Recommendations For The Authors):

      The authors dance around disorder but never really quantify or show data. This seems like a strange blindspot.

      We apologize for not explaining this topic sufficiently well in our preprint manuscript. We have endeavored to do so in our revised manuscript.

      The authors claim the expression enhancement is "autonomous," but they have not ruled things out that would make it not autonomous.

      Evidence of the “autonomous” nature of expression enhancement is presented in Figure 1, Figure 4, and Figure 5 of the preprint manuscript.

      Recommendations for improving the writing and presentation.

      The title does not recapitulate the entire body of work. The first 5 figures are not represented by the title in any way, and indeed, I have serious misgivings as to whether the conclusion stated in the title is supported by the work. I would strongly suggest the authors change the title.

      Figure 2 could be supplemental.

      Thank you. We think it is important to keep Figure 2 in the text.

      Figures 4 and 5 are not discussed much or particularly well.

      This reviewer’s opinion of Figure 4 and Figure 5 is in stark contrast to those of the first reviewer.

      The introduction, while very thorough, takes away from the main findings of the paper. It is more suited to a review and not a tailored set of minimal information necessary to set up the question and findings of the paper. The question that the authors are after is also not very clear.

      Thank you. The entire “Introduction” section has been extensively rewritten in the revised manuscript.

      Schematics of their fusion constructs and changes to the sequence would be nice, even if supplemental.

      Schematics of the fusion constructs are provided in Figure 1A.

      The methods section should be substantially expanded.

      The Methods section in the revised manuscript has been rewritten and expanded. The six JavaScript programs used in this work are listed in Table S4.

      The text is not always suited to the general audience and readership of eLife.

      We have now rewritten parts of our manuscript to make it more accessible to the broad readership of eLife.

      In some cases, section headers really don't match what is presented, or there is no evidence to back the claim.

      The section headers in the revised manuscript have been corrected.

      A lot of the listed results in the back half of the paper could be a supplemental table; listing %s in a paragraph (several of them in a row) is never nice.

      Acknowledged. In the revised manuscript, we have removed almost all sentences listing %s.

      Minor corrections to the text and figures.

      There is a reference to table 1 multiple times, and it seems that there is a missing table. The current table 1 does not seem to be the same table referred to in some places throughout the text.

      Apologies for this mistake, which we have now corrected in our revised manuscript.

      In some places it's not clear where new work is and where previous work is mentioned. It would help if the authors clearly stated "In previous work...."

      Acknowledged. We have corrected this oversight in our revised manuscript.

      Not all strains are listed in the strain table (the KOs in Figure 3 are not included).

      Apologies, we have now corrected Table S2, as suggested by this reviewer.

      Author response table 2.

      S. cerevisiae strains used in this study

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #2 (Recommendations For The Authors):

      While the details are mostly well-explained, I think that the authors could better bring forth the goals and potential usages of hippocampome.org overall.

      I think that this is a great and helpful tool that can leverage the various detailed cellular experimental studies that are out there in the literature to garner potential insights, direct future experimental studies, observe/classify experimental 'differences' (e.g., the deep and superficial pyramidal studies they mention) and so on. Say that one gets some mechanistic insight from more abstract theoretical models; hippocampome can then be used to determine whether the experimental data, where available, support the theory. They also describe a CA3 model and grid cells. While I am not suggesting that the authors completely re-organize the manuscript, I did feel that the last section 'potential applications...' could have perhaps been brought forth earlier (in a summarized form) for the reader/user to better appreciate hippocampome - indeed, I thought that line 288 should be near the beginning of the paper.

      We thank the Reviewer for the suggestion. We have now included a summary of the simulation readiness of Hippocampome.org in the Introduction.

      I thought the 'application' paragraph (starting line 288) needed expansion to appreciate - I did not have a chance to look at the cited papers in that section - but maybe 2 paragraphs, one on CA3 and the other on grid cells, with a few more sentences of goal/context and tool usage details could be provided?

      We thank the Reviewer for the suggestion. We have added expanded paragraphs describing the simulation work on CA3 and grid cells.

      The authors start their Discussion by mentioning other resources (e.g. blue brain) in comparison. I thought that this was not too helpful without a bit more expansion about these other resources and what in particular is comparable. For example, the blue brain project is different in that it does not mine the literature per se (I think)? But then I am not sure of the extent of the comparison that the authors intend with blue brain and the other mentioned resources.

      Thank you for the helpful suggestion. We have now expanded upon the paragraph to draw more explicit parallels and contrasts among the various projects, in particular between the Blue Brain Project and Hippocampome.org.

      Minor comments

      • Fig 3D caption missing

      Thank you for pointing this out. We have now amended the figure caption.

      • Fig 5A line 211-12 refers to v2.0 but Fig 5 caption says v1.0?

      We apologize for the confusion. We have now added text clarifying the V1.X relevant descriptions around Figure 5.

      • Fig 6A confusing with thin and thick arrows and direction?

      We apologize for the confusion. We have re-colored the thick arrows orange to emphasize the fact that they are feeding directly into the spiking neural simulations.

      • Line 260 - not sure what this means - how is importance defined?

      We apologize for the confusion. We have now added text clarifying that “importance” refers to the role the neuron type plays in the functioning circuitry of the hippocampal formation.

      • CARLsim vs Brian/NEST in choosing - maybe a sentence or two for rationale

      Thank you for the suggestion. We have now added a sentence explaining the selection of CARLsim. CARLsim was selected due to its ability to run on collections of GPUs. CARLsim was the only simulator with this capability at the time the simulation work was being planned, and the power of a GPU supercomputer was needed to simulate the millions of neurons that comprise a full simulation of the complete hippocampal formation.

      • Fig 9 mv should be mV, and the voltage values specified there refer to which dash?

      Thank you for pointing these situations out. We have amended the millivolts label and have made changes to the figure to help clarify which specific tick marks are being labeled.

      Reviewer #3 (Recommendations For The Authors):

      Compliments to the authors on this nicely organized and structured presentation of V 2.0 of hippocampome.org. The paper is well prepared, giving a useful short summary of the history of hippocampome for newcomers and refreshing the memory of users, then highlighting the new data additions, why these are relevant and how they complement the existing database, and opening up to new applications. The added potential is well illustrated and, in addition, the authors provide numerical information on the usage of this amazing resource. I enjoyed roaming around in the new version, which was made available for reviewers, and although it has been a while since I worked with the system, the new version is easy to work with. I have not had the time to use it extensively, so I cannot comment in detail, but based on the long experience of the authors and their support team, I trust that version 2 will be almost, if not completely, flawless; however, that will for sure become clear when it is released.

      One could always wish for more, disagree, or even criticize choices made to cluster neurons, divide areas, and so forth, though in my view that does not contribute to what the resource has to offer. Having said this, the authors might consider briefly addressing differences in the nomenclature used in original descriptions and how they handled the translation into their own nomenclature. To mention one issue that is constantly being debated: how does one define the border between SMo and SMi?

      Thank you for the suggestion. We have added text to the Introduction that addresses the nomenclature issue, as presented in Hamilton et al. (2017), and provide a definition for SMo and SMi.

      Another confusing issue is presented by layers in the entorhinal cortex or its subdivisions (how many and how are these defined). So, some remarks for newcomers in the field who might use the database without spending too much energy to read the original data, might be useful.

      Thank you for the suggestion to clarify this situation pertaining to the entorhinal cortex. Often, we have assumed the authors’ own definitions of the layers and subdivisions (medial and lateral), when naming neuron types. When our name is a hybrid of two published names that include both medial and lateral neurons, our name is prefixed by a simple EC, rather than by MEC or LEC.

      As noted, the authors present version 2 nicely and comprehensibly and I have only a few additional comments, meant to further improve the already high quality of the paper.

      1) The figures, nice as they are, are incredibly information-dense, so they require serious study to get the details; the legends do help, but the many abbreviations coming from totally different fields make it challenging to keep track of them while reading. This is a pity since there is a lot of new information in this version of the dataset, compared to previous versions and the authors overall succeed in emphasizing what is new and why this might be of use/importance.

      So a few suggestions: i) add relevant/most important abbreviations to the legends of the individual figures; ii) introduce all abbreviations upon first use and do not simply refer to the table in the methods. Interestingly, even the authors lose track in the introduction where they use BICCN in line 43 and refer to the abbreviation list, though the full name is given two lines below.

      We apologize for the confusion. We have amended the main text to clarify abbreviations. We have added the abbreviation definitions to the captions of the figures, and in some instances, removed the abbreviations from the figures altogether where space allowed.

      2) Figure 3 and even more so figure 5 depend strongly on the color differences red/green; please change since generally red/green is no longer used for obvious reasons.

      Thank you for pointing this out. We have switched the fonts in Figure 3 to black (excitatory) and gray (inhibitory) to match our previous publication. We have also changed the color schemes in Figure 5 to avoid red and green.

      Reviewer #3 commented on the complexity of our figures and how information-dense they are. To partially address this, we have decided to remove panel A2 of Figure 3. It was originally meant to show the source of the information used to add new axonal projections to two v1.0 neuron types; however, the illustration makes its point without it. Thus, we have removed the panel and amended the caption for Figure 3A to include the cited reference.